ACACCAATCCAAAAGCGTGGAACTATGTTAAAAAGCTACAACATAATATT
AATGCTATCAAATCTTCTAGCTCTTCAGAAGTTTATCAATCAGTTGCAGA
AGGAAAAATGATTGTGGGGTTGACTTAGGAAGACCCTAGTGTCAATTTGC
AAAAAAGTGGTGCCAATGTTTCTATTGTATATCCGACAGAAGGGACAGTT
TTTGTCCCATCTTCGGTTGCAATTATAAAGAATGCTCCTTCTATGAAAGA 
AGCAAAGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCT 
TTGGGCAGTCAACGAGTAACCGACCTATTCGTAAAGATGCCCAAACGAGT
AATGGCATGAAAGCTTTAAAGGATATTGCTACTCTTAAAGAAGATTATCG 
CTATGTCACTAAGCATAAGGGCCAAATCCTTAAAACCTATAATCGTATTC 
GTAGAAATGCTGAT 

SEQ ID NO. 6004 

STRAIN H36B

T AAAC T AC T T C C AC C AAAAG AAT T AGT TAT T CT AAGT C C AAAT AG T C AAG 
CCATTTTAACAGGAACGATTCCAGCTTTTGAGGAAAAATACGGTATAAAA 
GT T AAG C T TAT T C AAG G T G GG AC AGGT C AACT AAT AG AT AGAT T AAGT AA 
G G AGGGT AAGC AGT T G AAG G C GG AT AT T T T C T T T GG AG G AAAT TAT ACG C 
AATTT GAAAGT CAT AAGGCAT T GT T T GAGT CTT ACGT AT CAAAGAATATT 
CAT ACT GT TAT T C C AGAT TAT AT C CAT C CGAGT G AT ACG G C GAC AC C T T A 
T AC TAT AAAT GG G AG T GT CT T GAT T GT AAAT AAc G AAT T AGT TAAGG GAC 
T T AC C AT C AAG AGT TAT G AAGAT T TAT T AC AG C C T T C C T T AAAAGGT AAA 
ATTGCCTTTGCAGATCCGAATACTTCCTcTAGTGCTTTCTCACAACTCAC 
T AAT AT ACT CT T GG C C AAG G G T G G T T AC AC C AAT C C AAAAGC GT G GAAC T 
AT GTT AAAAAGCTACAACAT AAT AT T AAT GCT AT CAAAT CT T CT AGCT CT 
T C AG AAG T T TAT C AAT C AG T T G C AG AAGG AAAAAT GAT TGTGGGGTT GAC 
TTACGAAGACCCTAGTGTCAATTTGCAAAAAAGTGGTGCCAATGTTTCTA 
TTGTATATCCGACAGAAGGGACAGTTTTTGTCCCATCTTCGGTTGCAATT 
AT AAAGAATGCT C CTT CT AT GAAAGAAGC AAAGTT AT TT AT T AAT T T T AT 
GCTTTCTTT AGAT GTTCAAAATGCCTTTGGGC AGT CAACGAGTAACCGAC 
CT AT T CGTAAAGATGC CC AAAC GAGT AAT GGCAT GAAAGCTTTAAAGGAT 
AT T G C T AC T CT T AAAG AAG AT T AT CGC T AT GT C AC T AAG C AT AAGGG C C A 
AATCCTTAAAACCTATAATCGTATTCGTAGAAATGCTGAT 

SEQ ID NO. 6005 

STRAIN 18RS21 

C AGC C T T C T AAAC T AC T T C C AC C AAAAG AAT TAG T T AT T C T AAGT C C AAA 
TAGTCAAGCCATTTTAACAGGAACGATTCCAGCTTTTGAGGAAAAATACG 
GTATAAAAGTTAAGCTTATTCAAGGTGGGACAGGGCAACTAATAGATAGA 
T T AAGT AAGG AG GGT AAG C AGT T G AAG G C G g AT AT TTTCTTTG GAG G AAA 
T TAT ACG C AAT T T G AAAG T CAT AAGG CAT T G T T T GAG T C T T AC G T AT C AA 
AG AAT GT T CAT AC T G T TAT T C C AG AC TAT AT C CAT C C AAGT GAT AC G G C G 
AC AC CT TAT ACT AT AAAT GG G AGT GT CTT GAT T GT AAAT AAC G AAT TAG C 
T AAG GG AC T T AC CAT C AAG AGT TAT G AAG AT T TAT TAG AGC CTT CCT TAA 
AAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCTCTAGTGCTTTCTCA 
C AAC T C AC T AAT AT AC T CT T GG C C AAG G G T G G T T AC AC C AAT C C AAAAGC 
GTGGAACTAT GTT AAAAAGCTACAACAT AAT ATT AATGCT AT CAAAT CT T 
CTAGCTCTTCAGAAGTT T AT C AAT CAGT T GC AGAAGGAAAAAT GAT TGT G 
GGGCTGACTTACGAAGACCCTAGTGTCAATTTGCAAAAAAGTGGTGCCAA 
T GT T T CT AT T G TAT AT C C GAC AG AAGGG AC AGT TTTTGTCC CAT C T T CGG 
T T GC AATT AT AAAG AATGCT CCTTCTATGAAAGAAGC AAAGTT AT TT ATT 
AATTTTATGCTTTCTTTAGATGTTCAAAATGCCTTTGGGCAGTCAACGAG 
TAACCGACCTATTCGTAAAGATGCCCAAACGAGTAATGGCATGAAAGCTT 
T AAAGGAT AT T G C TACT C T T AAAG AAG AT T AT CGC TAT G T C ACT AAG CAT 
AAGGGCCAAAT C CTT AAAAC CT AT AAT CGT AT T CGT AGAAAT GCT GAT 

SEQ ID NO. 6006 

STRAIN M732 

C AG C C T T CT AAACT AC T T C C AC C AAAAG AAT TAG T 

T ATT CT AAGT C CAAAT AGT CAAGCC AT T TT AAC AGGAACGATT C CAGCTT 
TTGAGG AAAAAT ACGGT AT AAAAGT T AAG CT TAT T CAAGGT GGG AC AGGG 
C AAC T AAT AG AT AG AT T AAGT AAG GAG G GT AAG C AGT T G AAGG CG GAT AT 
TTTCTTTGGAGGAAATTATACGCAATTTGAAAGTCATAAGGCATTGTTTG 
AGT CTT ACGT AT C AAAG AAT GT T CAT ACT GTT ATT CC AG ACT AT AT CC AT 
CCGAGTGATACGGCGACACCTTATACTATAAATGGGAGTGTCTTGATTGT
AAATAACGAATTAGCTAAGGGACTTACCATCAAGAGTTATGAAGATTTAT
TACAGCCTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCC 
T C T AGT G CT T T CT C AC AACT C ACT AAT AT ACT CT T GG C C AAG GGT GGT T A 
C AC C AAT C C AAAAG C GT GG AAC T AT GT T AAAAAG C T AC AAC AT AAT AT T A 
ATGCTATCAAATCTTCTAGCTCTTCAGAAGTTTATCAATCAGTTGCAGAA 
GGAAAAATGATTGTGGGGTTGACTTACGAAGACCCTAGTGTCAATTTGCA 
AAAAAGTGGTGCCAATGTTTCTATTGTATACCCGACAGAAGGGACAGTTT 
TTGTCCCATCTTCGGTTGCAATTATAAAGAATGCTCCTTCTATGAAAGAA 
GCAAAGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTT 
T GG G C AGT C AAC G AGT AAC C G AC C TAT T C GT AAAG AT G C C C AAAC AAGT A 
AT G G CAT G AAAGCT T T AAAG GAT AT C G CT AC T C T T AAAG AAG AT TAT CG C 
TATGTCACTAAGCATAAGAGCCAAATCCTTAAAACCTATAATCGCATTCG 
T AGAAAT G C T GAT 

SEQ ID NO. 6007 

STRAIN COH1 

CAGCCTTCTAAACTACTTCCACCAAAAGAATTAGTT 

ATTCTAAGTCCAAATAGTCAAGCCATTTTAACAGGAACGATTCCAGCTTT 
T GAG G AAAAAT AC GGT AT AAAAG T T AAGC T T AT T C AAG G T G G G AC AG G G C 
AACTAATAGATAGATTAAGTAAGGAGGGTAAGCAGTTGAAGGCGGATATT 
TTCTTTGGAGGAAATTATACGCAATTTGAAAGTCATAAGGCATTGTTTGA 
GT CT T ACGT AT C AAAG AAT GT T CAT ACT GT TAT T C C AG AC TAT AT C CAT C 
CGAGTGATACGGCGACACCTTATACTATAAATGGGAGTGTCTTGATTGTA 
AAT AAC G AAT T AG CT AAG GG AC T T AC CAT C AAG AGT TAT G AAG AT T TAT T 
ACAGCCTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCT 
C T AGT G C T T T C T C AC AAC T C ACT AAT AT ACT C T T GG C C AAGGGT GGT T AC 
AC C AAT C C AAAAG CGT G GAAC T AT GT T AAAAAG CT AC AAC AT AAT AT T AA 
TGCTATCAAATCTTCTAGCTCTTCAGAAGTTTATCAATCAGTTGCAGAAG 
GAAAAATGATTGTGGGGTTGACTTACGAAGACCCTAGTGTCAATTTGCAA 
AAAAGTGGTGCCAATGTTTCTATTGTATACCCGACAGAAGGGACAGTTTT 
TGT CCCAT CTT CGGTT GC AATTAT AAAGAAT GCT C CT T CT ATGAAAGAAG 
CAAAGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTTT 
G GG C AGT C AAC G AG T AAC C GAC C T ATT CGT AAAG AT GCCC AAAC AAGT AA 
T G G CAT G AAAG CT T T AAAG GAT AT C G C T ACT C T T AAAGAAG AT TAT C GC T 
AT G T C AC T AAG CAT AAG AG C C AAAT C C T T AAAAC C TAT AAT C GC AT T C G T 
AGAAAT GCT GAT 

SEQ ID NO. 6008 

STRAIN M781

CAGCCTTCTAAACTACTTCCACCAAAAGAATTAGTTATT

CTAAGTCCAAATAGTCAAGCCATTTTAACAGGAACGATTCCAGCTTTTGA
GGAAAAATACGGTATAAAAGTTAAGCTTATTCAAGGTGGGACAGGGCAAC
TAATAGATAGATTAAGTAAGGAGGGTAAGCAGTTGAAGGCGGATATTTTC
TTTGGAGGAAATTATACGCAATTTGAAAGTCATAAGGCATTGTTTGAGTC
TTACGTATCAAAGAATGTTCATACTGTTATTCCAGACTATATCCATCCGA
GTGATACGGCGACACCTTATACTATAAATGGGAGTGTCTTGATTGTAAAT
AACGAATTAGCTAAGGGACTTACCATCAAGAGTTATGAAGATTTATTACA
GCCTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCTCTA
GTGCTTTCTCACAACTCACTAATATACTCTTGGCCAAGGGTGGTTACACC
AATCCAAAAGCGTGGAACTATGTTAAAAAGCTACAACATAATATTAATGC
TATCAAATCTTCTAGCTCTTCAGAAGTTTATCAATCAGTTGCAGAAGGAA
AAATGATTGTGGGGTTGACTTACGAAGACCCTAGTGTCAATTTGCAAAAA
AGTGGTGCCAATGTTTCTATTGTATACCCGACAGAAGGGACAGTTTTTGT
CCCATCTTCGGTTGCAATTATAAAGAATGCTCCTTCTATGAAAGAAGCAA
AGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTTTGGG
CAGTCAACGAGTAACCGACCTATTCGTAAAGATGCCCAAACAAGTAATGG
CATGAAAGCTTTAAAGGATATCGCTACTCTTAAAGAAGATTATCGCTATG
TCACTAAGCATAAGAGCCAAATCCTTAAAACCTATAATCGCATTCGTAGA
AATGCTGAT

SEQ ID NO. 6009 

STRAIN CJB110 

CAGCCTTTTAAACTACTTCCACCAAAAGAATTAGTTATTCT
AAGTCCAAATAGTCAAGCCATTTTAACAGGAACGATTCCAGCTTTTGAGg
AAAAATACGGTATAAAAGTTAAGCTTATTCAAGGTGGGACAGGGCAACTA
ATAGATAGATTAAGTAAGGAGGGTAAGCAGTTGAAGGCGGATATTTTCTT
TGGAGGAAATTATACGCAATTTGAAAGTCATAAGGCATTGTTTGAGTCTT 
ACG TAT C AAAGAAT GT T CAT AC T GT T AT T C C AGAC TAT AT C CAT C C AAGT 
GATACGGCGACACCTTATACTATAAATGGGAGTGTCTTGATTGTAAATAA 
C GAAT T AG C T AAG GGACT T AC CAT C AAGAGT T AT GAAG AT T T AT T AC AG C 
CTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCTCTAGT 
GCTTTCTCACAACTCACTAATATACTCTTGGCCAAGGGTGGTTACACCAA 
T C C AAAAGCG T GG AAC TAT GT T AAAAAG CT AC AAC AT AAT AT T AAT G C T A 
T C AAAT C T T CT AG CT CT T C AGAAGT T T AT C AAT C AGT T GC AGAAGGAAAA 
AT GAT TGTGGGGCT GAC T T AC GAAG AC C C T AGT GT C AAT T T G C AAAAAAG 
TGGTGCCAATGTTTCTATTGTATATCCGACAGAAGGGACAGTTTTTGTCC 
CAT CTT CGGT TGCAATT AT AAAGAAT GCT CCTT CTATGAAAGAAGCAAAG 
TTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTTTGGGCA 
GT C AACG AGT AAC C GAC C TAT T C GT AAAG AT G C C C AAAC GAGT AAT GGC A 
T GAAAGC T T T AAAG GAT AT T G C T ACT C T T AAAG AAGAT TAT C G C TAT GT C 
ACT AAG C AT AAGG G C C AAAT C C T T AAAAC C TAT AAT C GT AT T C G TAG AAA 
TGCTGAT 

SEQ ID NO. 6010 

STRAIN 1169NT 

ATAGTCAAGCCATTTTAACAGGAACGATTCCAGCTTTTGAGGAAAAATAC 
GGTATAAAAGTTAAGCTTATTCAAGGTGGGACAGGGCAACTAATAGATAG 
AT T AAGT AAG G AGG GT AAG CAT T T GAAG G CGG AT AT T T T C T t T GG AG GAA 
AT TAT AC G C AAT T T G AAAGT CAT AAG G CAT T G TT T GAGT C T T AC G TAT C A 
AAGAATGTTCATACTGTTATTCCAGACTATATCCATCCAAGTGATACGGC 
GAC AC C T TAT AC TAT AAAT G GG AG T GT C T T GAT T G T AAAT AAC GAAT TAG 
C T AAG G G AC T T AC CAT C AAG AGT TAT GAAG AT T TAT T AC AG CCTTCCTTA 
AAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCCTCTAGTGCTTTCTC 
AC AAC T C AC C AAT AT AC T C T T GG C AAAGG GT G G T T AC AC C AAT C C AAAAG 
C GT GG AACT AT GT T AAAAAG C T AC AAC AT AAT AT T AAT G C TAT C AAAT C T 
T C T AG CT C T T C AG AAGT T TAT C AAT C AGT T G C AG AAGGAAAAAT GAT T G T 
GGGGT T GAC T TAG G AAGAC C C T AGT GT C AAT T t G C AAAAAAGT GGT G C C A 
ATGTTTCTATTGTATATCCGACAGAAGGGACAGTTTTTGTCCCATCTTCG 
G T T GC AAT TAT AAAG AAT GCTCCTTC TAT GAAAG AAG C AAAGT TAT T TAT 
TAATTTTATGCTTT CTT T AGAT GTTCAAAATGCCTTTGGGCAGTCAACGA 
GT AAC C GAC CT AT T C GT AAAG AT GC C C AAAC GAGT AAT GG CAT G AAAG CT 
T T AAAGG AT AT T G C T ACT C T T AAAG AAG AT TAT C G CT AT GT C AC T AAG C A 
T AAGG G C C AAAT C C T T AAAAC C TAT AAT CGT AT T CGT AG AAAT G CT GAT 

SEQ ID NO. 6011 

STRAIN JM91130013 

CAGCCTTCTAAACTACTTCCACCAAAAGAATTAGT 

TAT T C T AAGT C C AAAT AG T C AAG C CAT T T T AAC AG G AACG AT T C C AGC T T 
T T GAG G AAAAAT AC GGT AT AAAAGT T AAG C T TAT T C AAG GT G GGAC AGGG 
CAACTAATAGATAGATTAAGTAAGGAGGGTAAGCAGTTGAAGGCGGATGT 
TTTCTTTG G AGG AAAT TAT AC GC AAT T T G AAAGT CAT AAG G CAT T GT TT G 
AGT C T T AC G TAT C AAAG AAT G T T CAT AC T G T TAT T C C AG AC TAT AT C CAT 
C C GAGT GAT AC GGC GAC AC C T TAT ACT AT AAAT GGG AGT G T C T T GAT T G T 
AAAT AAC GAAT TAG C T AAG GGACT T AC CAT C AAGAG T TAT GAAG AT T TAT 
TACAGCCTTCCTTAAAAGGTAAAATTGCCTTTGCAGATCCGAATACTTCC 
TCTAGTGCTTTCTCACAACTCACCAATATACTCTTGGCAAAGGGTGGTTA 
C AC C AAT C C AAAAG CGT GGAAC T AT GT T AAAAAG C T AC AAC AT AAT AT T A 
AT G C TAT C AAAT CTT CT AG C T C T T C AG AAGT T TAT C AAT C AGT T GC AG AA 
GGCAAAATGATTGTGGGGCTGACTTACGAAGACCCTAGTGTCAATTTGCA 
AAAAAGTGGTGCCAATGTTTCTATTGTGTATCCGACAGAAGGGACAGTTT 
TTGTCCCATCTTCGGTTGCAATTATAAAGAATGCTCCTTCTATGAAAGAA 
GCAAAGTTATTTATTAATTTTATGCTTTCTTTAGATGTTCAAAATGCCTT 
T G GG C AGT C AAC GAGT AAC C GAC C T AT T C GT AAAG AT G C C C AAAC GAGT A 
AT G G CAT GAAAG C T TT AAAG GAT AT T G CT AC T C T T AAAG AAG AT TAT C G C 
TAT GT C AC T AAG CAT AAG GGC C AAAT C C T T AAAAC CT AT AAT C GT AT T C G 
TAGAAATGCTGAT 

SEQ ID NO. 6012

STRAIN 2603 frame: 1

MKEKQSKRLIYILLWS 1 1 FISVFTYSISQPSKLLPPKELVILSPNSQAILTGTIPAFEE
KYGIKVKLIQGGTGQLIDRLSKEGKQLKADIFFGGNYTQFESHKALFESYVSKNVHTVIP 
DYIHPSDTATPYTINGSVLIVNNELAKGLTIKSYEDLLQPSLKGKIAFADPNTSSSAFSQ 
LTNILLAKGGYTNPKAWNYVKKLQHNINAIKSSSSSEVYQSVAEGKMIVGLTYEDPSVNL 
QKSGANVSIVYPTEGTVFVPSSVAIIKNAPSMKEAKLFINFMLSLDVQNAFGQSTSNRPI 
RKDAQTSNGMKALKDIATLKEDYRYVTKHKGQILKTYNRIRRNAD

SEQ ID NO. 6013 

STRAIN 090 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 
KGQILKTYNRIRRNAD

SEQ ID NO. 6014 

STRAIN A909 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA 
DIFFGGNYTQFESHKALFESYVSKNIHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 
KGQILKTYNRIRRNAD

SEQ ID NO. 6015 

STRAIN H36B frame: 2

KLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKADIF
FGGNYTQFESHKALFESYVSKNIHTVIPDYIHPSDTATPYTINGSVLIVNNELVKGLTIK 
SYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINAIKS 
SSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNAPSM
KEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKHKGQ 
ILKTYNRIRRNAD

SEQ ID NO. 6016 

STRAIN 18RS21 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 
KGQILKTYNRIRRNAD

SEQ ID NO. 6017 

STRAIN M732 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH
KSQILKTYNRIRRNAD

SEQ ID NO. 6018 

STRAIN COH1 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH
KSQILKTYNRIRRNAD 

SEQ ID NO. 6019 

STRAIN M781 frame: 1

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 
KSQILKTYNRIRRNAD 

SEQ ID NO. 6020 

STRAIN CJB110 frame: 1 

QPFKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA
DIFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH
KGQILKTYNRIRRNAD

SEQ ID NO. 6021 

STRAIN 1169NT frame: 3 

SQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKHLKADIFFGGNYTQFESHKAL 
FESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGLTIKSYEDLLQPSLKGKI 
AFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINAIKSSSSSEVYQSVAEGK 
MIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNAPSMKEAKLFINFMLSLD 
VQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKHKGQILKTYNRIRRNAD

SEQ ID NO. 6022 

STRAIN JM91130013 frame: 1 

QPSKLLPPKELVILSPNSQAILTGTIPAFEEKYGIKVKLIQGGTGQLIDRLSKEGKQLKA 
DVFFGGNYTQFESHKALFESYVSKNVHTVIPDYIHPSDTATPYTINGSVLIVNNELAKGL 
TIKSYEDLLQPSLKGKIAFADPNTSSSAFSQLTNILLAKGGYTNPKAWNYVKKLQHNINA 
IKSSSSSEVYQSVAEGKMIVGLTYEDPSVNLQKSGANVSIVYPTEGTVFVPSSVAIIKNA
PSMKEAKLFINFMLSLDVQNAFGQSTSNRPIRKDAQTSNGMKALKDIATLKEDYRYVTKH 
KGQILKTYNRIRRNAD 

SEQ ID NO. 6101 
STRAIN 2603 

ATGGTAAAAGTTAGTGTAAGTTCTGTAGGAACTCAAGCATCAACAGTAGCTATTTCTATG 
T T T AGT CG T GT AT C GG C T T T AAAT GAT G C AAT AAC AAAAC T AT CAT CT T T T G C AG AGG CT 
GCAACTCTTCAAGGGACTGCTTATTCAAATGCAAAAAGCTATGCTACTGGAACGTTAACT 
CCGATGCTTCAAGGAATGATTCTTTTCTCTGAAACATTGAGTGAGAAATGTACAGAATTA 
CAAACCTTATATGTCTCAATTTGTGGTGATGAGGATTTAGACTCTGTCGTTTTAGAATCA
AAAT T AGC AAG T G AT AGGG C AT CAT T AAAG AT T G C T G AAG C ACT T T TAG AG CAT C T T AAC 
GAT GAT C C AG AAC C T T C C AAAT CT G C C AT AAGT T C T AC AAAAAGT AAT AT T AAAAAAT T A 
AAAAAACGTATAAAATCTAATCAAAAGAAATTAGACAACCTTAATGAATTTAACGCCCAT
T C AG C AAC AGT AT T T G C GG AC AT T T CT AAT G C AC AGT C AAC T GT T AAC C AAG C ACT AG C G 
G CT GT T T C AAC AGG AT T T T CT G GAT AT AAT AGT AAAAC C GGAG CT T T T G G AAAAC C AAC A 
T C C G GAC AG AT GG AAT G G AC AAAG AC AG T T AAGAAG AAT T G G AAAG AG C GAG AAG AC G C C 
AAAG C T G AAGAACT G AAAAG T AAAAAG G C T G AAG AAAG T AAG AAAG CT T C AAAAAT T G AA 
AAT AC T AC T AAAAAAAG T AAT GT T T C AGT T GAT AAAAAG AAAT T AAT AAAAG C G G C T AAT 
GAAGCGTATAAATTAGGAGAAATTAAAAAAGATACCTATGAATCAATTATCAGTGGTTTA 
AGT AATGCATCGGCTGCCTTACTTAAAGAGGTAGCT AAAT CAAAATTGACT GAC ACAGCT 
CGGCTATTGATG 

SEQ ID NO. 6102 

STRAIN 090

T T AAAT GAT GC AAT AAC AAAAC TAT CAT C T T T T G C AG AGG C T 
GCAACTCTTCAAGGGACTGCTT ATT C AAAT GC AAAAAG CTATGCT ACT GG 
AACGTTAACTCCGATGCTTCAAGGAATGATTCTTTTCTCTGAAACATTGA 
G T G AG AAAT GT AC AG AAT T AC AAAC C T TAT AT GT C T C AAT T T GT G G T GAT 
G AGG AT T TAG AC TCTGTCGTT T TAG AAT C AAAAT T AG C AAGT GAT AGG G C 
AT C AT T AAAGAT T G C T G AAG C AC T T T TAG AG CAT C T T AAC GAT GAT C C AG 
AACCTT C C AAAT CTG CC AT AAGTTCT AC AAAAAGT AAT ATT AAAAAAT T A 
AAAAAAC G TAT AAAAT C T AAT C AAAAG AAAT TAG AC AAC C T T AAT G AAT T 
T AACG C C CAT T C AG C AAC AGT AT T T G C G GAC AT T T C T AAT G C AC AGT C AA 
CTGTTAACCAAGCACTAGCGGCTGTTTCAACAGGATTTTCTGGATATAAT
AGTAAAACCGGAGCTTTTGGAAAACCAACATCCGGACAGATGGAATGGAC
AAAGACAGTTAAGAAGAATTGGAAAGAGCGAGAAGACGCCAAAGCTGAAG 
AAC T GAAAAGT AAAAAGG CT G AAG AAAGT AAG AAAG C T T C AAAAAT T GAA 
AATACTACTAAAAAAAGTAATGTTTCAGTTGATAAAAAGAAATTAATAAA 
AGCGGCTAATGAAGCGTATAAATTAGGAGAAATTAAAAAAGATACCTATG 
AATCAATTATCAGTGGTTTAAGTAATGCATCGGCTGCCTTACTTAAAGAG 
GT AG C T AAAT C AAAAT T GACT GAC AC AG CT CG G C TAT T G ATG 

SEQ ID NO. 6103 

STRAIN 18RS21 

T T AAAT GAT GC AAT AAC AAAAC TAT CAT C T T T T G C AG AG G C 
TGCAACTCTTCAAGGGACTGCTTATTCAAATGCAAAAAGCTATGCTACTG 
GAACGTTAACTCCGATGCTTCAAGGAATGATTCTTTTCTCTGAAACATTG 
AG T GAG AAAT GT AC AG AAT T AC AAAC C T T AT AT GT C T C AAT T T GT G GT G A 
T G AGG ATT T AGAC TCTGTCGTTT TAG AAT C AAAAT TAG C AAGT GAT AG GG 
CAT CAT T AAAGAT T G C T G AAG C AC T T T TAG AG CAT C T T AAC GAT GAT C C A 
G AAC C T T C C AAAT CT G C CAT AAGT T CT AC AAAAAGT AAT AT T AAAAAAT T 
AAAAAAAC G TAT AAAAT C T AAT C AAAAG AAAT TAG AC AAC C T T AAT G AAT 
T T AAC GC C CAT T C AG C AAC AGT AT T T G CGG AC AT T T CT AAT G C AC AGT C A 
AC T GT T AAC C AAGC ACT AGCGG C T GT T T C AAC AGGAT T T T C T G G AT AT AA 
TAGTAAAACCGGAGCTTTTGGAAAACCAACATCCGGACAGATGGAATGGA 
C AAAG AC AG T T AAG AAGAAT T G G AAAG AG C GAG AAG AC G C C AAAGC T GAA 
G AACT GAAAAGT AAAAAGGCTG AAG AAAGT AAG AAAGCT T C AAAAAT T GA 
AAAT ACT ACT AAAAAAAGTAATGT TT C AGT TGAT AAAAAGAAAT T AAT AA 
AAG C G G C T AAT GAAG C GT AT AAAT T AGG AG AAAT T AAAAAAG AT AC C T AT 
G AAT C AAT TAT C AGT G GT T T AAG T AAT G CAT CGGCTGCC T T AC T T AAAGA 
G G TAG CT AAAT C AAAAT T GAC T G AC AC AGC T CGG C T AT T GAT G 

SEQ ID NO. 6104 

STRAIN 2603 frame: 1

MVKVSVSSVGTQASTVAISMFSRVSALNDAITKLSSFAEAATLQGTAYSNAKSYATGTLT 
PMLQGMILFSETLSEKCTELQTLYVSICGDEDLDSVVLESKLASDRASLKIAEALLEHLN 
DDPEPSKSAISSTKSNIKKLKKRIKSNQKKLDNLNEFNAHSATVFADISNAQSTVNQALA 
AVSTGFSGYNSKTGAFGKPTSGQMEWTKTVKKNWKEREDAKAEELKSKKAEESKKASKIE
NTTKKSNVSVDKKKLIKAANEAYKLGEIKKDTYESIISGLSNASAALLKEVAKSKLTDTA 
RLLM 

SEQ ID NO. 6105 

STRAIN 090 frame: 1 

LNDAITKLSSFAEAATLQGTAYSNAKSYATGTLTPMLQGMILFSETLSEKCTELQTLYVS 
ICGDEDLDSVVLESKLASDRASLKIAEALLEHLNDDPEPSKSAISSTKSNIKKLKKRIKS
NQKKLDNLNEFNAHSATVFADISNAQSTVNQALAAVSTGFSGYNSKTGAFGKPTSGQMEW 
TKTVKKNWKEREDAKAEELKSKKAEESKKASKIENTTKKSNVSVDKKKLIKAANEAYKLG 
EIKKDTYESIISGLSNASAALLKEVAKSKLTDTARLLM 

SEQ ID NO. 6106 

STRAIN 18RS21 frame: 1 

LNDAITKLSSFAEAATLQGTAYSNAKSYATGTLTPMLQGMILFSETLSEKCTELQTLYVS 
ICGDEDLDSVVLESKLASDRASLKIAEALLEHLNDDPEPSKSAISSTKSNIKKLKKRIKS 
NQKKLDNLNEFNAHSATVFADISNAQSTVNQALAAVSTGFSGYNSKTGAFGKPTSGQMEW 
TKTVKKNWKEREDAKAEELKSKKAEESKKASKIENTTKKSNVSVDKKKLIKAANEAYKLG
EIKKDTYESIISGLSNASAALLKEVAKSKLTDTARLLM

SEQ ID NO. 6201 
STRAIN 2603 

ATGATTTTAAAAATTTGTCGTGCAGCATATAGTTTACAATGGGGAGGTGTTTACCAATTA 
GCTTTGCT GG AT TAT C C T C G AAT T AAGG C GT T T G AAT T G G AAAG GAT AGG AG C T T T CAT A 
G C T T AC GAG AAAC AAT AT AAAAG AAAAAC T G AGAT AC AAT G T GAC GAT AAAC AT C T C C T C 
G C AAAAAT T GT T C AT TT T TT AAAAT AC AAT AGT TT TACTTTTCCCT AT ATT CCC AAAT AT 
AGAGAAGCGGCAGCTACTTTTAATGAGGATGGTATTAGTTTAACTTCTGATTTTTTAAGC 
CAT ACAT GT ACGAT TGAAACTGC AAAACT AATT TTT AAAGAAGGT AAAAT CT T AT CAGC A 
GTTAAAGCCTTTAATAAGCCTGCTGAAGTACTGGTAAAAGATAAGAGGAATGCTGCTGGA 
GAC C C T AAAGAT TACT T T GACT AT G T GAT G T T G AAC T G GT C AAAT AC C AAT T C T G GT T AT 
CGTTTAGTAATGGAAAGATTGTTAGGCAAAGCACCATCTGAACAGGAGTTAACAGTAGGT
TTTAAGCCAGGGGTCAGTTTTCATTTTACTTATCAAGATATCATCAATCATCCTGATTCT
AT T T T T GAT G GT TAT CAT C CT G C T AAAAT T AAAAAT C AG CT T T C T T TAG C AGAAC AT T T A 
GTTGCATGTGTTATCCCAAAACATTATCAAGAAGATTATCAAAGCCTTGTGCCCAATGAC 
TTGAAACACAGGGTTTATTATTTAGATTACTGTAACGAAACACTTTATGAGTGGAATCAA 
AAAGTTTATGATTTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6202
STRAIN 090 

TGGATTATCCTCTAATTAAGGCGTTTGAATTGGAAAGGATAGGAGCTTTC 
AT AG C T T AC GAG AAAC AAT AT AAAAGAAAAAT T G AG AT AC AAT GT GACG A 
TAAACATCTCCTCACAAAAATTGTTCATTTTTTAAAATACAATAGTTTTA 
C T T T T C C CT AT AT T C C C AAAT AT AG AG AAG CGG C AG CT ACT T T T AAT GAG 
GAT GGT AT T AGT T T AAC T T C T GAT T T T T T AAG C CAT AC AT G T AC GAT T GA 
AACT GC AAAAC T AAT T T T TAAAG AAG GT AAAAT C T TAT C AG C AGTT AAAG 
CCTTTAATAAGCCTGCTGAAGTACTGGTAAATGATAAGAGGAATGCTGCT 
GGAGACCCTAAAGATTACTTTGACTATGTGATGTTGAACTGGTCAAATAC 
C AAT T CT GGT TAT C GT T TAG T AAT G G AAAG AT T GT TAG G C AAAG C AC CAT 
C T G AAC AGG AGT T AAC AGT AG C T T T T AAGC C AG G G GT C AG C T T T C ATT T T 
AAT T a T C AAGAT AT CAT C AAT CAT C CT G AT T C T AT T T T T GAT GGTT AT C A 
TCCTGCTAAAATT AAAAAT CAACTTTCTTTAGCAGAACATTT AGTT G CAT 
GT G T T AT C C C AAAAC AT TAT C AAGAAG AT T AT C AAAG CCTTGTGC C T AAT 
G AC T T G AAACAC AG AGT T T AT T AT T TAG AT TACT GT AAC G AAAC AC T T T A 
T G AGT G G AAT C AAAAAGT T TAT GAT TTTCTTTGT CAT T T GGAAAAT AAA 

SEQ ID NO. 6203 
STRAIN A909

TTGCTGGATTATCCTCGAATTAAGGCGTTTGAATTGGAAAGGATA 

GG AG C T T T CAT AG C T T AC GAGAAAC AAT AT AAAAGAAAAAT T GAG AT AC A 

AT GT G AC GAT AAAC AT C T C C T C AC AAAAAT T G T T CAT T T T T T AAAAT AC A 

ATAGTTTTACTTTTCCCTATATTCCCAAATATAGAGAAGCGGCAGCTACT 

T T T AAT GAG GAT G GT AT T AGT T T AAC T T CT G AT T T T T T AAG C CAT AC AT G 

T ACG AT T G AAAC T G C AAAACT AAT T T T TAAAG AAGGT AAAAT C T T AT C AG 

C AGT TAAAG C CT T T AAT AAG C C T G C T G AAGT ACT GGT AAAT GAT AAG AG G 

AATGCTGCTGGAGACCCTAAAGATTACTTTGACTATGTGATGTTGAACTG 

GTCAAATACCAATTCTGGTTATCGTTTAGTAATGGAAAGATTGTTAGGCA 

AAG C AC CAT CT G AAC AG G AGT T AAC AGT AG C T T T T AAG C C AGG G GT C AG C 

TTTCATTTTAATTATCAAGATATCATCAATCATCCTGATTCTATTTTTGA 

T GG T T AT CAT C C T G CT AAAAT T AAAAAT C AAC T T T CT T TAG C AG AAC AT T 

TAGTTGCATGTGTTATCCCAAAACATTATCAAGAAGATTATCAAAGCCTT 

G T G C CT AAT GAC T T G AAAC AC AG AGT T T AT TAT T TAG AT TACT G T AAC G A 

AACACTTTATGAGTGGAATCAAAAAGTTTATGATTTTCTTTGTCATTTGG 

AAAATAAA 

SEQ ID NO. 6204 
STRAIN H36B 

T T AAGG C GT T T GAAT T G GAAAG GAT AG GAG C T T T CAT AG C T TAG GAG AAA 
C AAT AT AAAAGAAAAAT T GAG AT AC AAT GT GACGAT AAAC AT CT CCT C AC 
AAAAATTGTTCATTTTTTAAAATACAATAGTTTTACTTTTCCCTATATTC 
C C AAAT AT AGAG AAG CG G C AG C T AC T T T T AAT GAG GAT G GT AT TAG T T T A 
ACTTCTGATTTTTTAAGCCATACATGTACGATTGAAACTGCAAAACTAAT 
T T T TAAAG AAG GT AAAAT C T TAT C AG C AGT TAAAG C C T T T AAT AAG C CT G 
CTGAAGTACTGGTAAATGATAAGAGGAATGCTGCTGGAGACCCTAAAGAT 
T ACTT TGACT AT GT GAT GTTGAACT GGT C AAAT ACC AAT TCT GGTT ATCG 
T T T AG T AAT GG AAAG AT T G T T AG GC AAAGC AC CAT C T G AAC AG GAG T T AA 
CAGTAGCTTTTAAGCCAGGGGTCAGCTTTCATTTTAATTATCAAGATATC 
ATCAATCATCCTGATTCTATTTTTGATGGTTATCATCCTGCTAAAATTAA 
AAATCAACTTTCTTTAGCAGAACATTTAGTTGCATGTGTTATCCCAAAAC 
ATT AT CAAGAAGATT AT C AAAGC CT TGT GC CT AAT GACT TGAAAC ACAGA 
GTTTATTATTTAGATTACTGTAACGAAACACTTTATGAGTGGAATCAAAA 
AGTTTATGATTTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6205 
STRAIN 18RS21 

TTGCTGGATTATCCTCGAATTAAGGCGTT
TGAATTGGAAAGGATAGGAGCTTTCATAGCTTACGAGAAACAATATAAAA
G AAAAACT G AG AT AC AAT GT G AC G AT AAAC AT CT C C T C G C AAAAAT T GT T 
CAT T T T T T AAAAT AC AAT AGT T T T AC TT T T C C CT AT AT T C CC AAAT AT AG 
AGAAGCGGCAGCTACTTTTAATGAGGATGGTATTAGTTTAACTTCTGATT 
TTTTAAGCCATACATGTACGATTGAAACTGCAAAACTAATTTTTAAAGAA 
GGT AAAAT CTT AT CAGCAGTTAAAGCCTTTAATAAGCCTGCTGAAGTACT 
GGT AAAAG AT AAGAGG AAT GC T G CT G GAG AC C C T AAAGAT T ACT T T GAC T 
ATGTGATGTTGAACTGGTCAAATACCAATTCTGGTTATCGTTTAGTAATG 
G AAAGAT T GT T AG GC AAAG C AC CAT C T GAAC AGGAGT T AAC AGT AG GT T T 
T AAGCCAGGGGTCAGTTTTCATTT TACT TAT CAAGAT AT CATC AAT CATC 
CTGATTCTATTTTTGATGGTTATCATCCTGCTAAAATTAAAAATCAGCTT 
T C T T TAG C AG AAC AT T T AG T T G CAT G T G T TAT C C C AAAAC AT TAT C AAG A 
AG AT TAT C AAAG CCTTGTGC C C AAT GAC T T G AAAC AC AGGGT T TAT TAT T 
TAG AT T AC T GT AAC G AAAC ACT T TAT GAG T GG AAT C AAAAAGT T T AT GAT 
TTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6206 
STRAIN M732 

TTGCTGGATTATCCTCGAATTAAGGCGTT 

TG AAT T GG AAAGGAT AGG AGCT T TC AT AG CT T AC GAGAAAC AAT AT AAAA 
GAAAAACTGAGATACAATGTGACGATAAACATCTCCTCGCAAAAATTGTT 
CAT T T T T T AAAAT AC AAT AGT T TT ACT T T T C C CT AT AT T C CC AAAT AT AG 
AG AAG CGG C AG C T AC T T T T AAT GAG GAT GGT AT T AGT T T AAC T T CT G AT T 
TTTTAAGCCATACATGTACGATTGAAACTGCAAAACTAATTTTTAAAGAA 
GGT AAAAT C T TAT C AG C AGT T AAAG C C T TT AAT AAG C CT G C T G AAGT ACT 
G GT AAAAG AT AAG AGG AAT G C T G C T GG AGAC C C T AAAG AT T AC T T T GAC T 
ATGTGATGTTGAACTGGTCAAATACCAATTCTGGTTATCGTTTAGTAATG 
GAAAG AT T GT T AG GC AAAG C AC CAT C T GAAC AG GAG T T AAC AGT AGGT T T 
T AAG C C AGGGGT C AG T T T T CAT T T TAG T TAT CAAGAT AT CAT C AAT CAT C 
C T GAT T CT AT T T T T GAT GGT TAT CAT C C T G C T AAAAT T AAAAAT C AG CTT 
T CT T TAG C AG AAC AT T T AGT T G CAT GT G T TAT C C C AAAAC AT TAT C AAG A 
AG AT TAT C AAAG C CT T G T G CC C AAT GAC T T G AAAC AC AG G GT T TAT TAT T 
TAG AT TACT GT AAC G AAAC AC T T TAT G AGT G G AAT C AAAAAGT T T AT GAT 
TTTCTTTGnCATTTGGAAAATAAA 

SEQ ID NO. 6207 
STRAIN COH1 
TTGCTGGAT 

TAT C C T C G AAT T AAG G C GT T T G AAT T GGAAAGGAT AG G AG CT T T CAT AG C 
TT AC GAGAAAC AAT AT AAAAG AAAAACT GAGAT AC AAT GT GACGAT AAAC 
ATCTCCTCGCAAAAATTGTTCATTTTTTAAAATACAATAGTTTTACTTTT 
CCCTATATTCCCAAATATAGAGAAGCGGCAGCTACTTTTAATGAGGATGG 
TATTAGTTTAACTTCTGATTTTTTAAGCCATACATGTACGATTGAAACTG 
CAAAACTAATTTTTAAAGAAGGT AAAAT CT TAT CAGCAGTTAAAGCCTTT 
AATAAGCCTGCTGAAGTACTGGTAAAAGATAAGAGGAATGCTGCTGGAGA 
C C C T AAAG AT T AC T T T GAC TAT GT GAT GT T GAAC T GGT C AAAT AC C AAT T 
CTGGTTATCGTTTAGTAATGGAAAGATTGTTAGGCAAAGCACCATCTGAA 
CAGGAGTT AACAGT AGGT TTTAAGCC AGGGGT CAGTTTTCATTTT ACT T A 
TCAAGATATCATCAATCATCCTGATTCTATTTTTGATGGTTATCATCCTG 
CT AAAAT T AAAAAT C AG C T T T C T T TAG C AG AAC AT T T AGT T G C AT GT GT T 
AT C C C AAAAC AT TAT C AAG AAG AT TAT C AAAG CCTTGTGCC C AAT G AC T T 
G AAAC AC AGGGT T TAT TAT T TAG AT TACT G T AAC G AAAC AC T T TAT G AGT 
GGAATCAAAAAGTTTATGATTTTCTTTGGCATTTGGAAAATAAA 

SEQ ID NO. 6208 
STRAIN M781

TTGCTGGA 

T T AT C C T C G AAT T AAG G CGT T T G AAT T G GAAAG GAT AG G AG CT T T CAT AG 
CT T ACGAG AAAC AAT AT AAAAG AAAAACT GAGAT AC AAT GT GACGAT AAA 
CATCTCCTCGCAAAAATTGTTCATTTTTTAAAATACAATAGTTTTACTTT 
T C CC T AT AT T C C C AAAT AT AG AG AAG CGG C AG CT AC T T T T AAT GAG GAT G 
G TAT TAG T T T AACT T CT GAT T T T T T AAG C CAT AC AT G T AC G AT T G AAACT 
GCAAAACTAATTTTTAAAGAAGGTAAAATCTTATCAGCAGTTAAAGCCTT
TAATAAGCCTGCTGAAGTACTGGTAAAAGATAAGAGGAATGCTGCTGGAG
ACCCTAAAGATTACTTTGACTATGTGATGTTGAACTGGTCAAATACCAAT
TCTGGTTATCGTTTAGTAATGGAAAGATTGTTAGGCAAAGCACCATCTGA 
ACAGGAGTTAACAGTAGGTTTTAAGCCAGGGGTCAGTTTTCATTTTACTT 
AT C AAG AT AT CAT C AAT CAT C C T GAT T C TAT T T T T G AT GGT T AT CAT C CT 
G CT AAAAT T AAAAAT C AGC T T T C T T TAG C AG AAC AT TT AGT T G C AT GT GT 
TAT C C C AAAAC AT T AT C AAGAAG AT TAT C AAAG CCTTGTGCC C AAT G ACT 
TGAAACACAGGGTTTATTATTTAGATTACTGTAACGAAACACTTTATGAG 
TGGAATCAAAAAGTTTATGATTTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6209 
STRAIN CJB110 

TTGCTGGATTATCCTCGAATTAAGGC

GTTTGAATTGGAAAGGATAGGAGCTTTCATAGCTTACGAGAAACAATATA 
AAAG AAAAAT T G AGAT AC AAT GT G AC G AT AAAC AT CT C C T C AC AAAAAT T 
GTT CAT TTTTT AAAAT ACAATAGTTTTACTTTTCCCT AT ATT CCCAAATA 
TAGAGAAGCGGCAGCTACTTTTAATGAGGATGGTATTAGTTTAACTTCTG 
AT T T T T T AAG C CAT AC AT G T AC GAT T G AAACT G C AAAAC T AAT T T T T AAA 
GAAGGTAAAATCTTATCAGCAGTTAAAGCCTTTAATAAGCCTGCTGAAGT 
ACTG GT AAAT G AT AAGAGG AAT G C T G C T GG AG AC C C T AAAGAT TACT T T G 
ACTATGTGATGTTGAACTGGTCAAATACCAATTCTGGTTATCGTTTAGTA 
AT GG AAAG AT T GT T AGGC AAAG C AC CAT C T G AAC AGG AGT T AAC AGT AG C 
TTTTAAGCCAGGGGTCAGCTTTCATTTTAATTATCAAGATATCATCAATC 
ATCCTGATTCTATTTTTGATGGTTATCATCCTGCTAAAATTAAAAATCAA 
CTTTCTTTAGCAGAACATTTAGTTGCATGTGTTATCCCAAAACATTATCA 
AG AAG AT TAT C AAAG CCTTGTGC C T AAT G ACT T G AAAC AC AG AGT T TAT T 
AT T TAG AT T ACT GT AAC G AAAC AC T T TAT GAG T G G AAT C AAAAAGT T TAT 
GATT T T CT T T GT CATTT GGAAAAT AAA 

SEQ ID NO. 6210 
STRAIN 1169NT 

AATTAAGGCGTTTGAATTGGAAAGGATAGGAGCTTTCATAGCTTACGAGA 
AAC AAT AT AAAAG AAAAACT G AGAT AC AAT GT G AC G AT AAAC AT CT CC T C 
G C AAAAAT T GT T CAT T T T T T AAAAT AC AAT AGT T T T AC T T T T C C C TAT AT 
T C CC AAAT AT AG AG AAG CG GC AG C T ACT T T T AAT GAG GAT G GT AT T AGT T 
T AAC T T C T GAT T T T T T AAG C CAT AC AT GT AC GAT T G AAAC T G C AAAAC T A 
AT T T T T AAAG AAGGT AAAAT C T TAT C AG C AGT T AAAG C C T T T AAT AAGC C 
T G C T G AAGT AC T GGT AAAT GAT AAG AG G AAT G CT G C T G GAG AC C C T AAAG 
AT T AC T T T G AC T AT GT G AT GTT GAACT GGT C AAAT AC C AAT T C T GGT TAT 
CGTTTAGTAATGGAAAGATTGTTAGGCAAAGCACCATCTGAACAGGAGTT 
AACAGTAGGTTTTAAGCCAGGGGTCAGCTTTCATTTTACTTATCAAGATA 
TCATCAATCATCCTGATTCTATTTTTGATGGTTATCATCCTGCTAAAATT 
AAAAATCAGCTTTCTTTAGCAGAACATTTAGTTGCGTGTGTTATCCCAAA 
AC AT TAT C AAG AAG AT TAT C AAAAT C T T GT G C C C AAT G AC T T G AAAC AC A 
G AGT T TAT TAT T T AGAT T AC T GT AAC G AAAC AC T T T AT G AGT GG AAT C AA 
AAAGTTTATGATTTTCTTTGTCATTTGGAAAATAAA 

SEQ ID NO. 6211 
STRAIN JM9130013 

ATAGGAGCTTTCATAGCTTACGAGAAACAATATAAAAGAAAAATTGAGAT 
ACAATGTGACGATAAACATCTCCTCACAAAAATTGTTCATTTTTTAAAAT 
AC AAT AG T T T T ACT T T T C C C TAT AT T C C C AAAT AT AG AG AAG C G G C AG C T 
AC T T T T AAT GAG G AT GG TAT T AGT T T AAC T T C T GAT T T T T T AAGC CAT AC 
AT GT AC GAT T G AAAC T G C AAAAC T AAT T T T T AAAG AAG G T AAAAT C T TAT 
CAGCAGTTAAAGCCTTTAATAAGCCTGCTGAAGTACTGGTAAATGATAAG 
AG G AAT G CT G C T G GAG AC C CT AAAG AT T AC T T T G AC TAT GT G AT GT T GAA 
CTGGTCAAATACCAATTCTGGTTATCGTTTAGTAATGGAAAGATTGTTAG 
G C AAAGC AC CAT C T G a AC AG GAG T T AAC AG T AG CT T T T AAG C C AG GGGT C 
AG CT T T CAT T T T AAT TAT C AAG AT AT CAT C AAT CAT C C T GAT T C TAT T T T 
TGATGGTTATCATCCTGCTAAAATTAAAAATCAACTTTCTTTAGCAGAAC 
AT T T AGT T G CAT GT GT T AT C C C AAAAC AT TAT C AAG AAG AT T AT C AAAGC 
C T T GT G C C T AAT G AC T T G AAAC AC AG AG T T T AT T AT T T AG AT T ACT GT AA 
CGAAAC ACT TT AT GAGTGGAATC AAAAAGT TTATGATTTTCTTTGTCATT 
TGGAAAATAAA

SEQ ID NO. 6212

STRAIN 2603 frame: 1

MILKICRAAYSLQWGGVYQLALLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLL 
AKIVHFLKYNSFTFPYIPKYREAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSA 
VKAFNKPAEVLVKDKRNAAGDPKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVG
FKPGVSFHFTYQDIINHPDSIFDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPND 
LKHRVYYLDYCNETLYEWNQKVYDFLCHLENK 

SEQ ID NO. 6213 

STRAIN A909 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGD
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSI 
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLCHLENK 

SEQ ID NO. 6214 

STRAIN H36B frame: 3

KAFELERIGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYREAAATFN
EDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGDPKDYFDY 
VMLNWSNTNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSIFDGYHPA
KIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQKVYDFLCH 
LENK 

SEQ ID NO. 6215 

STRAIN 18RS21 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVKDKRNAAGD 
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLCHLENK 

SEQ ID NO. 6216 

STRAIN M732 frame: 1

LLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYR
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVKDKRNAAGD
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK
VYDFLXHLENK

SEQ ID NO. 6217 

STRAIN COH1 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVKDKRNAAGD
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLWHLENK 

SEQ ID NO. 6218 

STRAIN M781 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVKDKRNAAGD
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLCHLENK 

SEQ ID NO. 6219 

STRAIN CJB110 frame: 1 

LLDYPRIKAFELERIGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYR 
EAAATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGD
PKDYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSI
FDGYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQK 
VYDFLCHLENK

SEQ ID NO. 6220

STRAIN 1169NT frame: 2 

IKAFELERIGAFIAYEKQYKRKTEIQCDDKHLLAKIVHFLKYNSFTFPYIPKYREAAATF
NEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGDRKDYFD 
YVMLNWSNTNSGYRLVMERLLGKAPSEQELTVGFKPGVSFHFTYQDIINHPDSIFDGYHP 
AKIKNQLSLAEHLVACVIPKHYQEDYQNLVPNDLKHRVYYLDYCNETLYEWNQKVYDFLC 
HLENK 

SEQ ID NO. 6221 

STRAIN JM9130013 frame: 1 

IGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYREAAATFNEDGISLT 
SDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGDPKDYFDYVMLNWSN
TNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSIFDGYHPAKIKNQLS 
LAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQKVYDFLCHLENK 

SEQ ID NO. 6222 

STRAIN 090 frame: 3 

DYPLIKAFELERIGAFIAYEKQYKRKIEIQCDDKHLLTKIVHFLKYNSFTFPYIPKYREA 
AATFNEDGISLTSDFLSHTCTIETAKLIFKEGKILSAVKAFNKPAEVLVNDKRNAAGDPK 
DYFDYVMLNWSNTNSGYRLVMERLLGKAPSEQELTVAFKPGVSFHFNYQDIINHPDSIFD
GYHPAKIKNQLSLAEHLVACVIPKHYQEDYQSLVPNDLKHRVYYLDYCNETLYEWNQKVY 
DFLCHLENK 

SEQ ID NO. 6301 
STRAIN 2603 

ATGAAAAGTCGAAAAAAAGATAAATTGGTATTGAGGTTAACAACAACACTATTGGTTTTT
GGTTTGGGTGGGGTTTGGTTTTATAATTATAAAAATGATAATGTCGAACCGACAGTCACT 
AGTGCATCGGATCAAACGACGACTTTTATTCAAACGATTTCTCCAACAGCTATTGAAATT
TCTAAGACCTATGATTTGTATGCGTCAGTCTTATTAGCACAAGCTATTTTGGAATCATCC
AGTGGACAATCAGATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAGAA
TATAAAGGTAAATCTGTCCAAATGCCTACTTTAGAAGATGATGGGAAAGGCAATATGACT
CAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTGCTTCACTATATGATTATGCT 
GAGTTAGTATCTAGTCAAAAGTATGCATCTGTTTGGAAATCAAATACCTCTTCTTATAAG
GATGCTACTGCAGCTCTAACAGGTCTTTATGCGACAGATACTGCTTATGCTAGTAAATTA
AACCAAATTATTGAAACCTACAGTCTAGATGCTTATGATAAA

SEQ ID NO. 6302 

STRAIN 090 

GGGGTTTGGTTTTATAATTATAA 

AAAT GAT AAT GT C GAAC C G AC AG T C ACT AGT G CAT C GG AT C AAAC G AC GA 
C T T T T AT T C AAAC GAT T T CT C C AAC AG C TAT T G AAAT T T C T AAG AC C T AT 
GATTTGTATGCGTCAGTCTTATTAGCACAAGCTATTTTGGAATCATCCAG 
TGGACAATCAGATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCA 
AAG GAGAAT AT AAAG G T AAAT C T GT C C AAAT GC C T AC T T T AGAAGAT GAT 
GGGAAAGGCAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAA 
T TAT T C T G C T T C ACT AT AT GAT TAT G C T G AGT TAG TAT C TAG T C AAAAGT 
ATGCATCTGTTTGGAAATCAAATACCTCTTCTTATAAGGATGCTACTGCA 
G C T C T AAC AGGT C T T T AT G C G AC AG AT AC T G CT TAT G C TAG T AAAT T AAA 
C C AAAT TAT T G AAAC C T AC AG T C TAG AT G C T TAT GAT AAA 

SEQ ID NO. 6303 

STRAIN A909 

GGGGTTTGGTTTTATAATTATAA 

AAAT GAT AAT GT C GAAC C G AC AGT C ACT AGTG C AT C G GAT C AAAC G AC G A 
CTTTTATTCAAACGATTTCTCCAACAGCTATTGAAATTTCTAAGACCTAT 
GATTTGTATGCGTCAGTCTTATTAGCACAAGCTATTTTGGAATCATCCAG 
T GG AC AAT C AG AT T T G T CT AAG G CT C CT AAT TAT AAC C T CT T T GG C AT C A 
AAGG AG AAT AT AAAG G T AAAT C T GT C C AAAT G C C T AC T T TAG AAG AT GAT 
GGGAAAGGCAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAA 
T T AT T C T G C T T C AC TAT AT GAT TAT G C T GAG T TAG TAT C T AGT C AAAAG T 
ATGCATCTGCTTGGAAATCAAATACTTCTTCTTATAAGGATGCTACTGCA 
GCTCTAACAGGTCTTTATGCGACAGATACTGCTTATGCTAGTAAATTAAA 
C C AAAT T AT T G AAAC CT AC AGT C TAG AT G C T T AT GAT AAA 

SEQ ID NO. 6304

STRAIN H36B

GGGGTTTGGTTTTATAATTATAAAAATGATA 

AT GT CGAAC CG AC AGT C ACT AGT G CAT C GG AT C AAAC GAC G AC T T T TAT T 
C AAAC GAT T T CT C C AAC AGC T AT T G AAAT T T C T AAGAC CT AT GAT T T G T A 
T GCGT C AGT C T TAT TAG C AC AAGC T AT T T T GG AAT CAT C C AGT G GAC AAT 
CAGATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAGAA 
T AT AAAG GT AAAT CT GT C C AAAT G C C TAG T T T AG AAG AT GAT GGG AAAG G 
CAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTG 
CT T C AC TAT AT GAT TAT G CT G AGT TAG TAT C T AGT C AAAAGT a T GC AT C T 
GC T T GG AAAT CAAAT ACT T C T T C T T AT AAGG AT GC T AC T G C AGCT CT AAC 
AGGT C T T TAT GC GAC AG AT AC T G C T TAT G CT AG T AAAT T AAAC C AAAT T A 
T T GAAAC C T AC AGT CT AG AT G CT TAT GAT AAA 

SEQ ID NO. 6305 

STRAIN 18RS21 

GGGGTTTGGTTT TAT AAT T AT AAAAAT GAT AAT G 

T CGAAC C GAC AG T C ACT AGT G CAT C GG AT C AAAC GAC G ACT T T T AT T C AA 
AC GAT T T C T C C AAC AG C TAT T G AAAT TT C T AAG AC CT AT GAT T T GT AT G C 
GT C AGT C T TAT TAG C AC AAG C T AT T T T G GAAT CAT C C AG T GG AC AAT C AG 
ATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAGAATAT 
AAAG GT AAAT CT GT C CAAAT G C C T ACT T T AGAAG AT GAT G G G AAAGG C AA 
TATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTGCTT 
CACTATATGATTATGCTGAGTTAGTATCTAGTCAAAAGTATGCATCTGTT 
T G G AAAT CAAAT AC CT CT T C T TAT AAGG AT G C T AC T G C AG CT C T AAC AGG 
T C T T T AT GC G AC AG AT AC T G C T T AT G C T AGT AAAT T AAAC CAAAT TAT T G 
AAAC CT AC AGT C TAG AT G CT T AT GAT AAA 

SEQ ID NO. 6306 

STRAIN M732 

GGGGTTTGGTTTTATAATTATAA 

AAAT GAT AAT G T CG AAC CG AC AG T C AC TAG T G C AT C GG AT C AAAC GAC G A 
CTTTTATTCAAACGATTTCTCCAACAGCTATTGAAATTTCTAAGACCTAT 
GAT T T G T AT GCGT C AGT C T T AT TAG C AC AAG C T AT T T T GGAAT CAT C C AG 
TGGACAATCAGATTTGT'CTAAGGCTCCTAATTATAACCTCTTTGGCATCA 
AAGGAGAATATAAAGGTAAATCTGTCCAAATGCCTACTTTAGAAGATGAT 
GGGAAAGGC AAT ATGACT CAAAT CCAAGCT CCTTTTCGCGCCT AT CCAAA 
T T ATT CT GCTTC ACT AT ATGATTATGCTGAGTT AGT AT CT AGT C AAAAGT 
ATGCATCTGTTTGGAAATCAAATACTTCTTCTTATAAGGATGCTACTGCA 
GCTCTAACAGGTCTTTATGCGACAGATACTGCTTATGCTAGTAAATTAAA 
C CAAAT TAT T G AAAC CT AC AG T C T AG ATG C TT AT GAT AAA 

SEQ ID NO. 6307 

STRAIN COH1 

GGGGTTTGGTTTTATAATTATAA 

AAAT G AT AAT GT C GAAC C GAC AGT C AC T AGT G CAT C GG AT C AAAC GAC GA 
C T T T TAT T C AAAC GAT T T CT C C AAC AG CT AT T GAAAT T T CT AAG AC C T AT 
GAT T T G TAT GCGT C AGT C T TAT T AG C AC AAG C TAT T T T GGAAT CAT C C AG 
TGGACAATCAGATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCA 
AAGGAGAAT AT AAAGGT AAAT C T GT CCAAATGCCT ACT TTAGAAG AT GAT 
GGGAAAGGCAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAA 
T TAT T C T G C T T C ACT AT AT GAT T ATG C T G AGT TAG TAT CT AGT C AAAAGT 
ATGCATCTGTTTGGAAATCAAATACTTCTTCTTATAAGGATGCTACTGCA 
GCTCTAACAGGTCTTTATGCGACAGATACTGCTTATGCTAGTAAATTAAA 
C CAAAT TAT T G AAAC C T AC AG T C TAG AT G CT TAT GAT AAA 

SEQ ID NO. 6308 

STRAIN M781 

GGGGTTTGGTTTTATAATTATAAAAATGA 

T AAT GT C GAAC C GAC AGT C AC TAG T G CAT C GG AT C AAAC G ACG AC T T T T A 
T T C AAAC GAT T T C T C C AAC AGC TAT T G AAATT T CT AAG AC C T AT GAT T T G 
TATGCGTCAGTCTTATTAGCACAAGCTATTTTGGAATCATCCAGTGGACA 
ATCAGATTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAG 
AATATAAAGGTAAATCTGTCCAAATGCCTACTTTAGAAGATGATGGGAAA
GGCAATATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTC
T G CT T C ACT AT AT GAT TAT G C T G AGT T AGT AT CT AG T C AAAAGT AT G CAT 
CTGTTTGGAAATCAAATACTTCTTCTTATAAGGATGCTACTGCAGCTCTA 
AC AG GT CT T TAT G C GAC AGAT AC T G CT TAT G C T AGT AAAT T AAAC C AAAT 
TAT T GAAAC C T AC AGT C T AGAT G C T T AT GAT AAA 

SEQ ID NO. 6309 

STRAIN CJB110 

GGGGTTTGGTTT T AT AAT T AT AAAAAT G AT AAT GT 

C G AAC CG AC AGT C ACT AGT G CAT C G GAT C AAAC GAC G AC T T T TAT T C AAA 
CGATTTCTCCAACAGCTATTGAAATTTCTAAGACCTATGATTTGTATGCG 
T C AGT C T TAT TAG C AC AAG C T AT T T T G G AAT CAT C C AG T G GAC AAT C AGA 
TTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAGAATATA 
AAG GT AAAT C T GT C C AAAT G C C T AC T T TAG AAG AT GAT G GG AAAGG C AAT 
ATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTGCTTC 
AC TAT AT GAT TAT G CT G AG T T AGT AT C T AGT C AAAAG T AT G CAT C T GT T T 
GGAAATCAAATACCTCTTCTTATAAGGATGCTACTGCAGCTCTAACAGGT 
CT T T AT G CG AC AGAT AC T G C T T AT G C TAG T AAAT T AAAC C AAAT TAT T G A 
AAC C T AC AGT C T AGAT GCT TAT GAT AAA 

SEQ ID NO. 6310 

STRAIN 1169NT 

GGGGTTTGGTTT TAT AAT TAT AAAAAT GAT AAT G T 

C G AAC AG AC AGT C AC T AGT G C AT CGG AT C AAAC GAC GACT T T TAT T C AAA 
CGATTTCCCCAACAGCTATTGAAATTTCTAAGACCTATGATTTGTATGCG 
T C AGT C T TAT TAG C AC AAG CT AT T T T G G AAT CAT C C AGT GG AC AAT C AG A 
TTTGTCTAAGGCTCCTAATTATAACCTCTTTGGCATCAAAGGAGAATATA 
AAG GT AAAT CT G T C C AAAT G C C T AC T T TAG AAG AT GAT G G G AAAG G C AAT 
ATGACTCAAATCCAAGCTCCTTTTCGCGCCTATCCAAATTATTCTGCTTC 
ACT AT AT GAT TAT GCT G AGT T AGT AT C T AGT C AAAAG TAT G CAT C T G T T T 
GG AAAT C AAAT ACT TCTTCTT AT AAGGATGCT ACT GCAGCTCT AAC AGGT 
CT T T AT G C GAC AG AT AC T G C T T AT G CT AGT AAAT T AAAC C AAAT TAT T G A 
AAC C T AC AGT CT AG AT GCT TAT GAT AAA 

SEQ ID NO. 6311 

STRAIN JM9130013 

T T T G GT T T TAT AAT TAT AAAAAT GAT AAT GT C G AAC C GAC AGT C AC T AGT 
G CAT C G GAT C AAAC GAC GACT T T T AT T C AAAC GAT T T C C C C AAC AGC T AT 
T G AAAT T T C T AAG AC C T AT GAT T T G T AT GC GT C AGT C T T AT TAG C AC AAG 
C T AT T T T GG AAT CAT C C AG T GG AC AAT C AG AT T T GT C T AAG G C T C C T AAT 
TAT AAC CT CT T T GG C AT C AAAG GAG AAT AT AAAG GT AAAT C T GT T C AAAT 
GC C T ACT T T AGAAG AT GAT G GG AAAGGT AAT AT GAC C C AAAT C C AAG CT C 
CTTTTCGCGCC TAT C C AAAT TAT T C T G CT T C AC TAT AT GAT TAT GCT GAG 
TTAGTATCTAGTCAAAAGTATGCATCTGTTTGGAAATCAAATACCTCTTC 
T T AT AAGGAT G C T AC T G C AG CT C T AAC AGGT C T T T AT G C GAC AG AT AC T G 
CT TAT G CT AGT AAAT T AAAC C AAAT TAT T G AAAAC T AC AGT C TAG AT GCT 
TAT GAT AAA 

SEQ ID NO. 6312 
STRAIN 2603 frame: 1 

MKSRKKDKLVLRLTTTLLVFGLGGVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEI
SKTYDLYASVLLAQAILESSSGQSDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMT
QIQAPFRAYPNYSASLYDYAELVSSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKL
NQIIETYSLDAYDK

SEQ ID NO. 6313 

STRAIN 090 frame: 1 

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ 
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV 
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK

SEQ ID NO. 6314 

STRAIN A909 frame: 1

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASAWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6315 

STRAIN H36B frame: 1

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASAWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6316 

STRAIN 18RS21 frame: 1 

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6317 

STRAIN M732 frame: 1

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6318 

STRAIN M781 frame: 1 

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK

SEQ ID NO. 6319 

STRAIN CJB110 frame: 1 

GVWFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6320 

STRAIN 1169NT frame: 1 

GVWFYNYKNDNVEQTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQ
SDLSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELV
SSQKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIETYSLDAYDK 

SEQ ID NO. 6321 

STRAIN JM9130013 frame: 3 

WFYNYKNDNVEPTVTSASDQTTTFIQTISPTAIEISKTYDLYASVLLAQAILESSSGQSD
LSKAPNYNLFGIKGEYKGKSVQMPTLEDDGKGNMTQIQAPFRAYPNYSASLYDYAELVSS 
QKYASVWKSNTSSYKDATAALTGLYATDTAYASKLNQIIENYSLDAYDK 

SEQ ID NO. 6401 
STRAIN 2603 

ATGAACAAGTCTAAGAAAATCGAAAATTATCAATTATTATTACTACAAGCGCAAGCTCTA
TTCTCAGATGAAACAAATGCTCTTGCCAACTTATCAAATGCTTCAGCTATGCTAAATGCT 
ATGCTTCCAAATTCTGTATTTACAGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTT 
GGCCCTTTCCAGGGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGT 
GAATCTGCACAAACTGCTAAGACGCTGATCGTTGATGATGTTACAAAGCATGCTAACTAT 
ATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTACCTATGTTTAAAAATGGCAAA 
CTTCTAGGAGTTCTAGATTTAGATTCTTCTTTAGTAGCAGATTATGATGAGATTGATCAA
GAATACTTAGAAAAATTTGTAGGTATTCTAGTAGAACATACGATTTGGAATTTGGATATG 
TTTGGAGTTGAAAAG 

SEQ ID NO. 6402 

STRAIN 090 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTTA 
TCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTAC 
AGGCTTTTATTTATTTGATGGAAAGGAGTTAATTCTTGGCCCTTTCCAGG 
GTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGAA 
TCTGCACAAACTGCTAAGACGCTGATTGTTGATGATGTTACAAAGCATGC
TAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTACCTA
TGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTTA
GTAGCAGATTATGATGAGATTGATCAAGAATACTTAGAAAAATTTGTAGG
TATTCTAGTAGAACATACGATTTGGAATTTGGATA 

SEQ ID NO. 6403 

STRAIN A909

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAA 

CTTATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTAT 
TTACAGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCTTTC 
CAGGGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGG 
T G AAT C T G C AC AAAC T G CT AAG ACG CT GAT CGT T GAT G AT GT T AC AAAGC 
ATGCTAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTA 
CCTATGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTC 
T T T AGT AG C AGAT TAT GAT GAG ATT GAT C AAG AAT AC T T AGAAAAAT T T G 

TAGGTATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTT 
GAAAAG 

SEQ ID NO. 6404 

STRAIN H36B

CTCTATTCTCAGATGAAACAAATGCTCTTGC 

CAACTTATCAAATGCTTCAGCTATGCTAAaTGCTATGCTTCCAAATTCTG 
TATTTACAGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCT 
TTCCAGGGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTG 
TGGTGAATCTGCACAAACTGCTAAGACGCTGATCGTTGATGATGTTACAA 
AG CAT G CT AAC T AT AT C T C C T GT G AT T C AAAAG CT AT GAGT GAAAT CGT A 
GTACCTATGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTC 
T T CT T T AGT AG C AGAT TAT GAT G AGAT T GAT C AAG AAT AC T TAG AAAAAT 

TTGTAGGTATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGA 
GTT GAAAAG 

SEQ ID NO. 6405 

STRAIN 18RS21 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTT 

ATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTA 
CAGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCTTTCCAG 
GGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGA 
AT CTGC AC AAACT GCT AAG ACGCTGAT CGT TGATGATGTT AC AAAGC ATG 
C T AACT AT AT CT C C T G T GAT T C AAAAG CT AT GAGT GAAAT C GT AGT AC C T 
ATGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTT 
AGT AG C AG AT TAT GAT GAG AT T GAT C AAGAAT AC T T AGAAAAAT T T GT AG 

GTATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTTGAA 
AAG 

SEQ ID NO. 6406 

STRAIN M732 

CT C TAT T C T C AG AT G AAAC AAAT GCTCTTGC C AAC T T 
ATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTA 
CAGGCTTTTATTTATTTGATGGAGAGGAGTTAATTCTTGGCCCTTTTCAG 
G G T GG T GT AT CAT G T GT G CAT AT T AC T T T AGGAAAAGGT GTTTGTGGT G A 
AT C T G C AC AAAC T G C T AAG AC GCT GAT T G T T GAT GAT G T T AC AAAG CAT G 
CT AACT AT ATCTCCTGTGATTCAAAAGCT AT GAGT GAAAT CGT AGT ACCC 
AT GTT TAAAAATGGCAAACTTCTAGG AGT TCT AGAT TT AGAT TCTTCTTT 
AGT AG C AGAT TAT GAT GAG AT T GAT C AAG AAT AC T TAG AAAAAT T T GT AG 
GT AT T C T AGT AG AAC AT AC GAT T T GG AAT T T G GAT AT G T T T G G AGT T G AA 
AAG 

SEQ ID NO. 6407 

STRAIN COH1 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAAC 

TTATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATT 
TACAGGCTTTTATTTATTTGATGGAGAGGAGTTAATTCTTGGCCCTTTTC 
AGGGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGT 
GAATCTGCACAAACTGCTAAGACGCTGATTGTTGATGATGTTACAAAGCA
TGCTAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTAC
C CAT GT T T AAAAAT G G C AAACT T CT AGG AGT T C TAG AT T TAG AT T C T T CT 
T T AGT AG C AG AT TAT GAT GAG AT TG AT C AAG AAT AC T T AG AAAAAT T T GT 
AG GT AT T C T AGT AG AAC AT ACG AT T T G GAAT T T G GAT ATGT TT G GAGT T G 
AAAAG 

SEQ ID NO. 6408 

STRAIN M781

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTT 

ATCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTA 

CAGGCTTTTATTTATTTGATGGAGAGGAGTTAATTCTTGGCCCTTTTCAG 

GGTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGA 

ATCTGCACAAACTGCTAAGACGCTGATTGTTGATGATGTTACAAAGCATG

CTAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTACCC

ATGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTT 

AGTAGCAGATTATGATGAGATTGATCAAGAATACTTAGAAAAATTTGTAG

GTATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTTGAA 

AAG 

SEQ ID NO. 6409 

STRAIN CJB110 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTTA 
TCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTAC 
AGGCTTTTATTTATTTGATGGAAAGGAGTTAATTCTTGGCCCTTTCCAGG 
GTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGAA 
T C T G C AC AAAC T G CT AAG AC G CT GAT T GT T GAT GAT G T T AC AAAG CAT G C 
T AACT AT AT C T C C T G T GAT T C AAAAG C TAT GAGT G AAAT C GT AGT AC C T A 
TGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTTA 
GT AG C AG AT TAT GAT G AGAT T GAT C AAG AAT AC T TAG AAAAAT T T GT AGG 
TAT T CT AGT AG AAC AT AC GAT T T GG AAT T T G GAT AT GT T T G G AGT T G AAA 
AG 

SEQ ID NO. 6410 

STRAIN 1169NT 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTTA 

TCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTAC 

AGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCTTTCCAGG 

GTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGAA 

TCTGCACAAACTGCTAAGACGCTGATTGTTGATGATGTTACAAAGCATGC

TAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTACCCA

TGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTTA 

GTAGCAGATTATGATGAGATTGATCAAGAATACTTAGAAAAATTTGTAGG

TATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTTGAAA 

AG 

SEQ ID NO. 6411 

STRAIN JM9130013 

CTCTATTCTCAGATGAAACAAATGCTCTTGCCAACTTA 

TCAAATGCTTCAGCTATGCTAAATGCTATGCTTCCAAATTCTGTATTTAC 

AGGCTTTTATTTATTTGATGGAGAAGAGTTAATTCTTGGCCCTTTCCAGG 

GTGGTGTATCATGTGTGCATATTACTTTAGGAAAAGGTGTTTGTGGTGAA 

TCTGCACAAACTGCTAAGACGCTGATCGTTGATGATGTTACAAAGCATGC 

TAACTATATCTCCTGTGATTCAAAAGCTATGAGTGAAATCGTAGTACCTA

TGTTTAAAAATGGCAAACTTCTAGGAGTTCTAGATTTAGATTCTTCTTTA 

GTAGCAGATTATGATGAGATTGATCAAGAATACTTAGAAAAATTTGTAGG

TATTCTAGTAGAACATACGATTTGGAATTTGGATATGTTTGGAGTTGAAA

AG 

SEQ ID NO. 6412 
STRAIN 2603 frame: 1 

MNKSKKIENYQLLLLQAQALFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELIL 
GPFQGGVSCVHITLGKGVCGESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGK 
LLGVLDLDSSLVADYDEIDQEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6413

STRAIN 090 frame: 3

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGKELILGPFQGGVSCVHITLGKGVC
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLD 

SEQ ID NO. 6414 

STRAIN A909 frame: 3

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6415 

STRAIN H36B frame: 3

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6416 

STRAIN 18RS21 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6417 

STRAIN M732 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6418 

STRAIN COH1 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6419 

STRAIN M781 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6420 

STRAIN M781 frame: 3

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6421 

STRAIN CJB110 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGKELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6422 

STRAIN 1169NT frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK 

SEQ ID NO. 6423 

STRAIN JM9130013 frame: 3 

LFSDETNALANLSNASAMLNAMLPNSVFTGFYLFDGEELILGPFQGGVSCVHITLGKGVC 
GESAQTAKTLIVDDVTKHANYISCDSKAMSEIVVPMFKNGKLLGVLDLDSSLVADYDEID 
QEYLEKFVGILVEHTIWNLDMFGVEK

SEQ ID NO. 6501
STRAIN 2603 

ATGAAAAAGAGTACCCAAATAATACTACTAATAGTTGCA 

TTATTCATACTTGTTTTTAGCGGAGGATTTTATATGAAAGAACAACAAAGAAAAGAAGAA
CTAAAACGGAATCGAGAATATGAAGTTAGTCTAGTCAAAGCATTGAAAAATTCCTATGAG 
AATATAGAAGAAATAAAAATCACACATCCTGTTTCAACTGAAATTCCTGGAGATTGGCAT 
TGTACTGTAAAGATTTCATTTAATGATAAAAAATCTATTGTTTATAATATTACACATAAT
TTGGAATCGAAAAAAAATTATAGCGGAAAATTTAATGAAAAAAATATGAATTTTTTTGAT
TCAAGAATTGGTAAAACAAAAAAAACTATAAAAATTATTTTTTCAGATGGTCAGGAGAAG
ATACAA 

SEQ ID NO. 6502 

STRAIN 090 

GGAGGATTTTATATGAAAGAACA 

AC AAAG AAAAGAAG AAC T AAAAC GG AAT C GAG AAT AT G AAGT TAG T C T AG 
T C AAAG CAT T GAAAAAT T C C TAT GAG AAT AT AG AAGAAAT AAAAAT C AC A 
CAT CCTGTTTCAACTGAAATTCCTGGAGATTGGCAT TGT ACTGT AAAGAT 
T T CAT T T AAT GAT AAAAAAT C T AT T GT T TAT AAT AT T AC AC AT AAT T T G G 
AAT C GAAAAAAAAT TAT AG C G G AAAT T T T AAT GAAAAAAAT AT GAAT T T T 
T T T GAT T C AAG AAT T GG T AAAAC AAAAAAAACT AT AAAAAT TAT T T T T T C 
AG At GG t C AGG AG AAG AT a C AA 

SEQ ID NO. 6503 

STRAIN A909 

GG AGG AT TT TAT AT G AAAGAAC AACAA 

AG AAAAGAAGAACT AAAAC G GAAT CG AG AAT AT G AAG T T AG T C T AGT C AA 
AG CAT T GAAAAAT T C C TAT GAG AAT AT AG AAG AAAT AAAAAT C AC AC AT C 
CTGTTTCAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGATTTCA 
TT TAAT GAT AAAAAAT CT ATT GT TTAT AAT ATT AC ACAT AAT TT GGAAT C 
GAAAAAAAAT TATAG CGGAAAATT TAAT GAAAAAAAT AT GAAT TTTTTTG 
AT T C AAG AAT T G GT AAAAC AAAAAAAAC TAT AAAAAT TAT t T T T T C AG AT 
GG t C AGG AG AAG AT AC AA 

SEQ ID NO. 6504 

STRAIN H36B

GGAGGATTTTATATGAAAGAACA 

AC AAAG AAAAGAAG AAC T AAAACGG AAT C GAG AAT AT G AAG T T AGT C T AG 
T C AAAG CAT T GAAAAAT T C CT AT GAG AAT AT AG AAG AAAT AAAAAT C AC A 
CAT C C T GT T T C AAC T G AAAT T C C T GG AG AT T GG C AT T GT AC T GT AAAG AT 
TT CAT TTAAT GAT AAAAAAT CT AT TGTTT AT AAT ATT AC ACAT AAT TTGG 
AAT C GAAAAAAAAT TATAG C G G AAAAT T TAAT GAAAAAAAT AT GAAT T T T 
T T T GAT T C AAG AAT T G G T AAAAC AAAAAAAAC TAT AAAAAT T A t T T T T T C 
AG AT GG t C AGG AGAAG AT a CAA 

SEQ ID NO. 6505 

STRAIN 18RS21 

GG AGG AT TT T AT AT GAAAGAACAAC 

AAAGAAAAGAAGAACTAAAACGGAATCGAGAATATGAAGTTAGTCTAGTC 
AAAGCATT GAAAAATT CCTATGAGAAT AT AGAAGAAAT AAAAAT CACAC A 
TCCTGTTTCAACTGAAATTCCTGGAGATTGGCATTGTACTGT AAAGAT TT 
CAT T TAAT GAT AAAAAAT CTATTGT TTAT AAT AT TACACATAATTT GGAA 
T CGAAAAAAAAT TAT AGC GGAAAATTTAAT GAAAAAAAT AT GAATT TT TT 
T GAT T C AAG AAT T G GT AAAAC AAAAAAAAC TAT AAAAAT TAT T T T T T C AG 
ATGGtCAGGAGAAGATaCAA 

SEQ ID NO. 6506 

STRAIN M781

GG AGG AT T T TAT AT G AAAGAAC AACAAAG AAAA 

G AAG AAC TAAAACG GAAT C GAG AAT AT GAAGT T AGT C T AGT C AAAG CAT T 
GAAAAAT T C CT AT GAG AAT AT AG AAGAAAT AAAAAT CACAC AT CCT GTT T 
CAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGATTTCATTTAAT 
GAT AAAAAAT CT AT T G T T TAT AAT AT T AC AC AT AAT T T G GAAT C G AAAAA 
AAATTATAGCGGAAAATTTAATGAAAAAAATATGAATTTTTTTGATTCAA
GAATTGGTAAAACAAAAAAAACTATAAAAATTATTTTTTCAGATGGTCAG
GAGAAGATACAA 

SEQ ID NO. 6507 

STRAIN CJB110 

G G AGGAT T T TAT AT G AAAG AAC AAC AAAG AAAAG AAG AA 
CT AAAACG GAAT CG AG AAT AT G AAGT T AGT CT AGT C AAAG CAT T GAAAAA 
T T C CT AT G AGAAT AT AGAAG AAAT AAAAAT C AC AC AT C CT GT T T C AACT G 
AAAT T C CT GG AG AT T GGC AT T GT ACT G T AAAG AT T T CAT T T AAT GAT AAA 
AAATCTATTGTTTATAATATTACACATAATTTGGAATCGAAAAAAAATTA 
TAG C G G AAAT T T T AAT G AAAAAAAT AT GAAT T T T T T T GAT T C AAG AAT T G 
GTAAAACAAAAAAAAC TAT AAAAAT TAT T T T T T C AG AT G GT C AGG AG AAG 
ATACAA 

SEQ ID NO. 6508 

STRAIN 1169NT 

GG AGGAT TTT AT ATGAAAGAAC AAC AAAG 

AAAAG AAG AAC TAAAAC G GAAT C GAG AAT AT GAAG T TAG T CT AG T C AAAG 
CAT T G AAAAAT T CCT ATGAGAAT AT AGAAGAAAT AAAAAT C AC ACAT C CT 
GTTTCAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGATTTCATT 
TAATGATAAAAAATCTATTGTTTATAATATTACACATAATTTGGAATCGA 
AAAAAAATT AT AGTGGAAAATTTAATG AAAAAAAT AT GAATTTTTTTGAT 
T CAAG AAT T GGTAAAACAAAAAAAAC TAT AAAAAT TAT T T T T T C AGAT GG 
T C AGG AG AAGAT AC AA 

SEQ ID NO. 6509 

STRAIN JM9130013 

GGAGGATT T TAT AT GAAAGAACAAC 

AAAG AAAAG AAGAAC T AAAACG GAAT CG AG AAT AT G AAGT TAG T C T AGT C 
AAAGC ATT GAAAAATT CCT ATGAGAAT AT AGAAGAAAT AAAAAT CACACA 
TCCTGTTTCAACTGAAATTCCTGGAGATTGGCATTGTACTGTAAAGATTT 
CAT T T AAT GAT AAAAAAT C TAT T GT T TAT AAT AT T AC AC AT AAT T T G G AA 
T C G AAAAAAAAT T AT AG C G G AAAAT T T AAT G AAAAAAAT AT GAAT t T T TT 
T GAT T CAAG AAT T GGTAAAACAAAAAAAAC TAT AAAAAT TAT T T T T T C AG 
At GG t C AGG AG AAG AT AC AA 

SEQ ID NO. 6510 
STRAIN 2603 frame: 1

MKKSTQIILLIVALFILVFSGGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKI
THPVSTEIPGDWHCTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTK
KTIKIIFSDGQEKIQ

SEQ ID NO. 6511 

STRAIN 090

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGD
WHCTVKISFNDKKSIVYNITHNLESKKNYSGNFNEKNMNFFDSRIGKTKKTIKIIFSDGQ
EKIQ

SEQ ID NO. 6512 

STRAIN A909 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDWH
CTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQEK
IQ

SEQ ID NO. 6513 

STRAIN H36B

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGD
WHCTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQ
EKIQ

SEQ ID NO. 6514 

STRAIN 18RS21 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDW 
HCTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQE
KIQ

SEQ ID NO. 6515 

STRAIN CJB110 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDWHCTVK
ISFNDKKSIVYNITHNLESKKNYSGNFNEKNMNFFDSRIGKTKKTIKIIFSDGQEKIQ 

SEQ ID NO. 6516 

STRAIN JM9130013 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDW
HCTVKISFNDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQE
KIQ

SEQ ID NO. 6517 

STRAIN 1169NT frame: 1 

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDWHCTVKISF 
NDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQEKIQ 

SEQ ID NO. 6518 

STRAIN M781 frame: 1

GGFYMKEQQRKEELKRNREYEVSLVKALKNSYENIEEIKITHPVSTEIPGDWHCTVKISF
NDKKSIVYNITHNLESKKNYSGKFNEKNMNFFDSRIGKTKKTIKIIFSDGQEKIQ

SEQ ID NO. 6601 
STRAIN 2603 

TTGACAAGGCATATAAAAATTTCTATACTAAATTTACAAAATGAAGGAGAGGGAACTATG
GAAATACTGATTGCAGGTGGTAGTGGTTTTTTAGGAAAGCAGATAATAAAAGCAGCGCTT
ACAAAAGGGCATAAAGTGGCTTACTTATCAAGACATGAAGGTAAAGGTGATATATTTAAG
GATCCTAGATTAACCTACATTAGGGGAGATATTACAGAAGCTGATAAGATTCATTTAGAA 
GACAGAACTTTTGATATATTAATTGACTGTATTGGAGCGATTAAGCCCAATCAACTAGAT
GAGCTTAACGTTAAAGCAACCCAAAAAGCAGTAGCACTCTGTCACAAAAATCAAATACCA
AAGTTAGTTTATATTTCAGCCAACAGCGGCTATTCAGCTTACATTAAAAGTAAAAGGAAG 
GCAGAGCAGATAATCAAAGCAAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATG
TATGGTGAAGAGCGACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCAT 
TTGCCTTTCTTAGGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGATAGTGGCA 
GAAGCAATCGTTACTACGCTTAGGAAAAAACCAACCCAAAAAATCCTTTCTATTGAAGAA
TTAAATAATAAA

SEQ ID NO. 6602 
STRAIN 090 

AC AAG G CAT AT AAAAAT T T CT AT ACT AAAT T T AC AAAAT 
GAAGGAGAGGGAACTATGGAAATACTGATTGCAGGTGGTAGTGGTTTTTT 
AGG AAAG C AG AT AAT AAAAG C AG C G C T T AC AAAAGGG C AT AAAGT GG CT T 
ACT TAT CAAG AC AT G AAG GT AAAGGT GAT AT AT T T AAG GAT C CT AG AT T A 
ACCTACATTAGGGGAGATATTACAGAAGCTGATAAGATTCATTTAGAAGA 
C AG AACT T T T GAT AT AT T AAT T G ACT GT AT T G G AG CG AT T AAG C C C AAT C 
AAC TAG AT G AG CTT AAC GT T AAAG C AAC C C AAAAAG C AGT AG C AC T C T GT 
C AC AAAAAT C AAAT AC C AAAGT T AGT T TAT AT T T C AG C C AAC AG C GG C T A 
T T C AG CT T AC AT T AAAAG TAAAAGG AAG G C AG AG C AG AT AAT C AAAG C AA 
GCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAGAG 
CGACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCATTT 
GCCTTTCTTAGGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGA 
T AGT GG C AG AAG C AAT CGT T AC T AC G CT T AGG AAAAAAC C AAC C C AAAAA 
AT C CTTT CT ATT GAAGAAT T AAAT AAT AAA 

SEQ ID NO. 6603 
STRAIN A909 

AC AAGGC AT AT AAAAAT TTCT AT ACT AAAT TT AC AAAAT G 
AAGGAGAGGGAACTATGGAAATACTGATTGCAGGTGGTAGTGGTTTTTTA 
G G AAAG C AG AT AAT AAAAG C AG C G CT T AC AAAAG G G C AT AAAGT G G C T T A 
CTTATCAAGACATGAAGGTAAAGGTGATATATTTAAGGATCCTAGATTAA 
C C T AC AT TAG G G GAG AT AT T AC AG AAG C T GAT AAG AT T CAT T TAG AAG AC 
AGAACT T T T GAT AT AT T AAT T GAC T G T AT T G GAG C GAT T AAGC C C AAT C A 
ACTAGATGAGCTTAACGTTAAAGCAACCCAAAAAGCAGTAGCACTCTGTC
ACAAAAATCAAATACCAAAGTTAGTTTATATTTCAGCCAACAGCGGCTAT
T CAGCT TACATTAAAAGT AAAAGGAAGGCAGAGCAGAT AAT CAAAGCAAG 
CGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAGAGC 
GACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCATTTG 
CCTTTCTTAGGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGAT 
AGT GG C AG AAGC AAT C GT T AC T ACGC T T AGG AAAAAAC CAAC C C AAAAAA 
T C CT T T C TAT T G AAG AAT T AAAT AAT AAA 

SEQ ID NO. 6604 
STRAIN H36B 

T AT AAAAAT T T C TAT ACT AAAT T T AC AAAAT G AAG GAGAG GG AAC T AT G G 
AAATACTGATTGCAGGTGGTAGTGGTTTTTTAGGAAAGCAGATAATAAAA 
G C AG C G CT T AC AAAAG GG C AT AAAG T GG CT T AC T TAT C AAG AC AT G AAG G 
TAAAGGTGATATATTTAAGGATCCTAGATTAACCTACATTAGGGGAGATA 
T T AC AG AAG C T GAT AAG AT T CAT T T AG AAGAC AGAACT T T T GAT AT AT T A 
AT T GAC T GT AT T GGAG CG AT T AAG C C C AAT CAAC TAG AT GAG C T T AAC G T 
T AAAG CAAC C C AAAAAG C AGT AG C AC T CT G T C AC AAAAAT C AAAT AC C AA 
AGT T AGT T TAT AT T T C AG C CAAC AG CG G CT AT T C AGC T T AC AT T AAAAG T 
AAAAGG AAG G C AG AGC AG AT AAT C AAAG C AAGC G G T C T G GAT TAT C T T T T 
TGTAAGACCAGGTTTGATGTATGGTGAAGAGCGACCTCTCTCGATTTTCC 
AAGCCAAGTGTATAAAGTTATTTAGTCATTTGCCTTTCTTAGGTATTGTT 
GTACAAAAGGTCTTTCCAACTAAGGTTGTGATAGTGGCAGAAGCAATCGT 
TACT ACGCT TAGGAAAAAAC CAACC CAAAAAAT C CTTT CT ATT GAAGAAT 
TAAATAATAAA 

SEQ ID NO. 6605 
STRAIN 18RS21 

ACAAGGCAT ATAAAAATT T CT AT ACT AAAT TT AC AAAAT 
GAAGGAGAGGGAACTATGGAAATACTGATTGCAGGTGGTAGTGGTTTTTT 
AG G AAAGC AG AT AAT AAAAG C AGC G C T T AC AAAAG G G CAT AAAGT G G C T T 
ACT T AT CAAGAC ATGAAGGT AAAGGTGAT AT ATT T AAGGAT CCT AGATTA 
ACCTACATTAGGGGAGATATTACAGAAGCTGATAAGATTCATTTAGAAGA 
C AG AACT T T T GAT AT AT T AAT T G ACT G TAT T G G AG C GAT T AAG C C C AAT C 
AAC TAG AT G AG CT T AAC G T T AAAGC AAC C C AAAAAG C AGT AGC AC T C T GT 
CACAAAAATCAAATACCAAAGTTAGTTTATATTTCAGCCAACAGCGGCTA 
T TCAGCTTAC ATT AAAAGT AAAAGGAAGGCAGAGCAGAT AAT CAAAGCAA 
GCGGT CT GGATTATCTT TTT GT AAGAC CAGGTTT GAT GT AT GGTGAAGAG 
CGACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCATTT 
GCCTTTCTTAGGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGA 
TAGTGGCAGAAGCAATCGTTACTACGCTTAGGAAAAAACCAACCCAAAAA 
AT C CT T T C T AT T G AAG AAT T AAAT a AT AAA 

SEQ ID NO. 6606 
STRAIN M732 

CAAAATGAAGGAGAgGGAACT ATG gAAAT ACTGATTGCAG GT GGT AGT GG 
T T T T C T AG G G AAG C AG AT AAT AAAAG C AG C G CT T AC AAAAG G G C AT AAGG 
T G G CT TACT TAT C AAGG C AT GAAG GT AAAG GTG AT AT AT T T AAG GAT C c T 
AGATT AAC CT ACATT AAGGGAGAT AT T ACAGAAGCT GAT AAGAT T C AT TT 
AG a AC AT AG AAAT TTT GAT AT AT T AAT T GAC T GT AT T GGAG C GAT T AAG C 
C C AAT CAAC TAG AT GAG C T T AAC G T T AAAG CAAC C C AAAAAG C AG TAG C A 
CT CT GT CACAAAAAT C AAAT ACC AAAGT TAGTTT ACATT T CAGCCAAT AG 
C G G CT AT T C AG C T TAG AT T AAAAG T AAAAG GAAG G C AG AG C AG AT AAT C A 
AAGCAAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGT 
G AAG AG CG AC CT C T CT C GAT TTT C C AAG C C AAGT G TAT AAAAT TAT T TAG 
TCATTTGCCTTTCTTAGGTATTGTTGTACAAAAAGTCTTTCCAACTAAGG 
TTGTGATAGTGGCAGAAGCAATCGTTACTTCGCTTAGGAAAAAACCAACT 
CAAAAAAT C C T T T C T AT T GAAG AAT TAAATAATAAA 

SEQ ID NO. 6607 
STRAIN COH1 

ACAAGGCAT AT AAAAAT TTCTAT ACT AAATTT AC 

AAAAT GAAG G AG AGG GAAC T AT GG AAAT AC T GAT T G C AG GTG G T AGT G GT 
TTT CT AG G GAAG C AG AT AAT AAAAG C AG CG C T T AC AAAAG G G CAT AAG GT 
GGCTTACTTATCAAGGCATGAAGGTAAAGGTGATATATTTAAGGATCCTA
GATTAACCTACATTAAGGGAGATATTACAGAAGCTGATAAGATTCATTTA
G AAC AT AGAAAT T T T GAT AT AT T AAT T G ACT GT AT T G GAG C GAT T AAG C C 
C AAT C AAC T AG AT GAGCT T AAC GT T AAAG C AAC C C AAAAAG C AGT AG C AC 
TCTGTCACAAAAATCAAATACCAAAGTTAGTTTACATTTCAGCCAATAGC 
GGCTATTCAGCTTACATTAAAAGTAAAAGGAAGGCAGAGCAGATAATCAA 
AGCAAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTG 
AAG AG C G AC CT CT C T C GAT T T T C C AAG C C AAGT GT AT AAAAT T AT T T AGT 
CATTTGCCTTTCTTAGGTATTGTTGTACAAAAAGTCTTTCCAACTAAGGT 
TGTGATAGTGGCAGAAGCAATCGTTACTTCGCTTAGGAAAAAACCAACTC 
AAAAAAT C CTT T CT AT T GAAGAATT AAAT AAT AAA 

SEQ ID NO. 6608 
STRAIN M781

ACAAGGCATATAAAAATTTcTATACTAAATTTaCA 

AAAT G AAGG AG AG G G AACT AT G G AAAT AC T GAT T G C AGG T GGT AGT GG T T 
T T CT AG G G AAG C AG AT AAT AAAAG C AG C G C T T AC AAAAG GG C AT AAG GT G 
G C T T AC T TAT C AAG G CAT G AAGGT AAAG GT GAT AT AT T T AAG GAT C C T AG 
ATTAACCTACATTAAGGGAGATATTACAGAAGCTGATAAGATTCATTTAG 
AACATAGAAATTTTGATATATTAATTGACTGTATTGGAGCGATTAAGCCC 
AATCAACTAGATGAGCTTAACGTTAAAGCAACCCAAAAAGCAGTAGCACT 
C T G T C AC AAAAAT C AAAT AC C AAAGT T AG T T T AC AT T T C AG C C AAT AG CG 
G C TAT T C AG C T T AC AT T AAAAG TAAAAGG AAG G C AG AG C AGAT AAT C AAA 
GCAAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGA 
AG AG C G AC CT CT C T CG AT T T T C C AAG C C AAGT GT AT AAAAT TAT T T AGT C 
ATTTGCCTTTCTTAGGTATTGTTGTACAAAAAGTCTTTCCAACTAAGGTT 
GTGATAGTGGCAGAAGCAATCGTTACTTCGCTTAGGAAAAAACCAACTCA 
AAAAATCCTTTCTAtTGAAGAATTAAATAATAAA 

SEQ ID NO. 6609 
STRAIN 1169NT 

AC AAGGCATAT AAAAATTT CT ATACT AAAT TT ACAAA 
AT G AAG G AG AGGG AAC T AT GG AAAT AC T GAT T G C AGGT G GT AGT GGT T T T 
T TAG G AAAG C AG AT AAT AAAAGC AG C GC T T AC AAAAG G G CAT AAGT T G G C 
TTACTTATCAAGACATGAAGGTAAAGGTGATATATTTAAGGATCCTAGAT 
T AACCTAC AT T AAGGGAGATAT TACAGAAGCT GAT AAG ATT CAT T T AGAA 
G AC AG AAC T T T T GAT AT AT T AAT T G AC T GT AT T GG AG C G AT T AAG C C C AA 
T C AAC TAG AT GAG C T T AAC GT T AAAG C AAC C C AAAAAG C AGT AGC AC T C T 
GT C AC AAAAAT C AAAT AC C AAAGT T AGT T T AC AT T T C AG C C AAC AG C G G C 
TAT T CAGCTT AC AT T AGAAGTAAAAGGAAGGCAGAGC AGAT AAT CAAAGC 
AAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAG 
AG C G AC C T C T CT CG AT T T T C C AAG C C AAGT GT AT AAAAT TAT T T AGT CAT 
TTGCCTTT CTT AGGT ATTGTTGTACAAAAGGT CTT TCCAACT AAGGT TGT 
GAT AGT G G C AG AAG C AAT C GT T AC T AC G CT T AGGAC AAAAC C AAC T C AAA 
AAATCCTTTCTATTGAAGAATTAAATAATAAA 

SEQ ID NO. 6610 
STRAIN CJB110 

AC AAG G CAT AT AAAAAT T T CT AT ACT AAAT T T AC AAA 
ATGAAGGAGAGGGAACTATGGAAATACTGATTGCAGGTGGTAGTGGTTTT 
T T AG G AAAG C AG AT AAT AAAAG C AG C G C T T A C AAAAG G G CAT AAAG T GG C 
TTACTTATCAAGACATGAAGGTAAAGGTGATATATTTAAGGATCCTAGAT 
TAACCTACATTAGGGGAGATATTACAGAAGCTGATAAGATTCATTTAGAA 
GACAGAACTTTTGATATATTAATTGACTGTATTGGAGCGATTAAGCCCAA 
T C AACT AG AT G AG CT T AAC G T T AAAG C AAC C C AAAAAG C AGT AG C AC T C T 
GT CAC AAAAAT C AAAT AC C AAAGT TAGTT TAT AT TTCAGCC AAC AG CGGC 
TAT T C AG C T T AC AT T AAAAG T AAAAG G AAGGC AG AG C AG AT AAT C AAAG C 
AAGCGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAG 
AGCGACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCAT 
TTGCCTTTCTTAGGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGT 
GAT AG T GG C AG AAG C AAT C GT T AC T ACG CT T AGG AAAAAAC C AAC C C AAA 
AAAT CCTTTCT AT TGAAGAATT AAAT AAT AAA 

SEQ ID NO. 6611 
STRAIN JM9130013

ACAAGGCATATAAAAATTTCTATACTAAATTTACAAAATG
AAGGAGAGGGAACTATGGAAATACTGATTGCAGGTGGTAGTGGTTTTTTA 
GG AAAG C AG AT AAT AAAAG C AG C G C T T AC AAAAGGGC AT AAAGT GG CT T A 
CT T AT C AAGAC AT GAAGGT AAAGG T GAT AT AT T T AAG GAT C CT AGAT T AA 
C C T AC AT TAG G GGAGAT AT T AC AG AAG C T G AT AAGAT T CAT T T AGAAGAC 
AGAACTTTTGATATATTAATTGACTGTATTGGAGCGATTAAGCCCAATCA 
ACT AGAT GAG C T T AAC GT T AAAG C AAC C C AAAAAG C AGT AG C ACT CT GT C 
ACAAAAATCAAATACCAAAGTTAGTTTATATTTCAGCCAACAGCGGCTAT 
T C AG C T T AC AT TAAAAGT AAAAG G AAGG C AGAG C AG AT AAT C AAAG C AAG 
CGGTCTGGATTATCTTTTTGTAAGACCAGGTTTGATGTATGGTGAAGAGC 
GACCTCTCTCGATTTTCCAAGCCAAGTGTATAAAGTTATTTAGTCATTTG 
CCtTTCTTAgGTATTGTTGTACAAAAGGTCTTTCCAACTAAGGTTGTGAT 
AGT G G C AGAAGC AAT C GT TACT AC G CT TAG G AAAAAAC C AAC C C AAAAAA 
TCCTTTCTATTGAAGAATTAAATAATAAA 

SEQ ID NO. 6612 
STRAIN 2603 frame: 1

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6613 

STRAIN 090 frame: 1

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6614 

STRAIN A909 frame: 1

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6615 

STRAIN H36B frame: 2

IKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKDPRL 
TYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPKLVY 
ISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHLPFL 
GIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6616 

STRAIN 18RS21 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6617 

STRAIN M732 frame: 1 

QNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKDPRLTYIKGDIT 
EADKIHLEHRNFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPKLVYISANSGYS 
AYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHLPFLGIVVQKVF
PTKVVIVAEAIVTSLRKKPTQKILSIEELNNK

SEQ ID NO. 6618 

STRAIN COH1 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIKGDITEADKIHLEHRNFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL
PFLGIVVQKVFPTKVVIVAEAIVTSLRKKPTQKILSIEELNNK

SEQ ID NO. 6619

STRAIN M781 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD 
PRLTYIKGDITEADKIHLEHRNFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL
PFLGIVVQKVFPTKVVIVAEAIVTSLRKKPTQKILSIEELNNK

SEQ ID NO. 6620 

STRAIN 1169NT frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKLAYLSRHEGKGDIFKD 
PRLTYIKGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK 
LVYISANSGYSAYIRSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL 
PFLGIVVQKVFPTKVVIVAEAIVTTLRTKPTQKILSIEELNNK

SEQ ID NO. 6621 

STRAIN CJB110 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6622 

STRAIN JM9130013 frame: 1 

TRHIKISILNLQNEGEGTMEILIAGGSGFLGKQIIKAALTKGHKVAYLSRHEGKGDIFKD
PRLTYIRGDITEADKIHLEDRTFDILIDCIGAIKPNQLDELNVKATQKAVALCHKNQIPK
LVYISANSGYSAYIKSKRKAEQIIKASGLDYLFVRPGLMYGEERPLSIFQAKCIKLFSHL
PFLGIVVQKVFPTKVVIVAEAIVTTLRKKPTQKILSIEELNNK

SEQ ID NO. 6701 
STRAIN 090 

C AAT AAC AAC AT T T G AAAAT AAAAAAGT T T T AGT CCTTGGTT TAG C AC G A 
TCTGGAGAAGCCGCTGCACGTTTGTTAGCTAAGTTAGGAGCAATAGTGAC 
AGT T AAT GAT G G C AAAC CAT T T GAT G AAAAT C C AAC AG C AC AGT CT T T GT 
TGGAAGAGGGTATTAAAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTA 
GAT G AGG AT T T T T GT T AC AT GAT T AAAAAT C C AGG AAT AC C T TAT AAC AA 
T CCT AT GGT C AAAAAAG CAT T AG AAAAAC AAAT CCCTGTTTT G AC T G AAG 
TGGAATTAGCATACTTAGTTTCAGAATCTCAGCTAATAGGTATTACAGGC 
T CT AAC G G G AAAAC G AC AAC G AC AAC GAT GAT T G C AG AAGT C T T AAAT G C 
T GGAGGT C AG AG AG GTTTGTTAGCTGG GAAT AT CGG CT T T C C T G C T AGT G 
AAGT TGT T C AGGC T G C G GAT G AT AAAG AT AT T C TAG T TAT GG AAT TAT C A 
AGT T T T C AG C T AAT GG G AG T T AAG GAAT TTCGTCCT CAT AT T G C AGT AAT 
TACTAAT T T AATGCC AACT CATTT AGAT T AT C AT GGGT CTTT TGAAGAT T 
AT GTTGCT GC AAAATGGAAT AT C CAAAAT CAAAT GT CT T C AT CT GAT TTT 
TTGGTACTTAATTTTAATCAAGGTATTTCTAAAGAGTTAGcTAAAACTAC 
TAAAGCAACAATCGTTCCTTTCTCTACTACGGAAAAAGTTGATGGTGCTT 
AC GT AC AAG AC AAG C AAC T T T T C T AT AAAGGG G AG AAT AT TAT GT T AGT A 
GATGACATTGGTGTCCCAGGAAGCCATAACGTAGAGAATGCTCTAGCAAC 
TAT TGCGGTTGC T AAAC TAG C T GGT AT C AGT AAT C AAGT TAT T AGAG AAA 
CTTTAAGCAATTTTGGAGGTGTTAAACACCGCTTGCAATCACTCGGTAAG 
GTTCATGGTATTAGTTTCTATAACGACAGCAAGTCAACTAATATATTGGC 
AACTCAAAAAGCATTATCTGGCTTTGATAATACTAAAGTTATCCTAATTG 
C AG GAG G T CT T GAT C G C G GT AAT G AGT T T GAT GAAT T GAT AC C AG AT AT C 
AC T GG AC T T AAAC AT AT GGT T GT T T TAG GG GAAT CGG CAT C T C GAG T AAA 
AC G T G CT G C AC AAAAAG C AGG AGT AAC T TAT AG C GAT G C T T T AG ATG T T A 
GAG AT G C GGT AC AT AAAG C T T AT GAG GT G G C AC AAC AG G G C GAT G T TAT C 
T T G C T AAGT CCT G CAAAT G C AT C AT GGG AC AT GT AT AAG AAT T T C G AAG T 
CCGTGGTGATGAATTCATTGATACtTTCGAAAGTCTTAGAGGAGAG 

SEQ ID NO. 6702 
STRAIN A909

CAAT AACAACATTTGAAAAT AAAAAAGT TTT AGT CCT T GGT T TAGCACG A 
T C T GGAG AAG CT G C T G C AC GT T T GT T AG CT AAG T T AG GAG C AAT AG T G AC 
AG T T AAT GAT GG C AAAC CAT T T GAT G AAAAT C C AAC AG C AC AGT CTTT GT 
TGGAAGAGGGTATTAAAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTA
GATGAGGATTTTTGTTACATGATTAAAAATCCAGGAATACCTTATAACAA
T C C T AT G GT C AAAAAAGC AT T AGAAAAAC AAAT CCCTGTTTT G ACT G AAG 
T GGAATT AGCATACTTAGTT T CAGAAT CT CAGCT AATAGGT ATTAC AGGC 
T CT AAC G G GAAAAC GAC AAC G AC AAC GAT GAT T G C AG AAGT CT T AAAT G C 
TGGAGGTCAGAGAGGTTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTG 
AAGTT GTT CAGGCT GCGAAT GAT AAAGAT ACT CTAGTTAT GGAATT AT CA 
AGT T T T C AG CT AAT G GG AGT T AAG GAAT TTCGTCCT CAT AT T G C AGT AAT 
T AC T AAT T T AAT GC C AAC T CAT T TAG AT TAT CAT GGGTCTTTT GAAGAT T 
ATGTTGCTGCAAAATGGAATATCCAAAATCAAATGTCTTCATCTGATTTT 
T T G GT AC T T AAT T T T AAT C AAGG T AT T T C T AAAG AGT T AG CT AAAAC T AC 
T AAAGCaACAAT CGT T C CTTT CT CT ACT ACGGAAAAAGT T GAT GGTGCT T 
AC GT AC AAGAC AAG C AAC T T T T C T AT AAAGGG G AGAAT AT T AT G T C AGT A 
GAT GAC AT TGGTGTCC C AGG AAG C C AT AAC GT AnAGAAT G C T C T AGC AAC 
TATTGCGGTTGCTAAACTGGCTGGTATCAGTAATCAAGTTATTAgAGAAA 
CTTTAAGCAATTTTGGAGGtGTTAAACACCGCTTGCAATCACTCGGTAAG 
GTTCATGGTATTAGTTTCTATAACGACAGCAAGTCAACTAATATATTGGC 
AACT C AAAAAG CAT TAT CTGGCTTT G AT AAT ACT AAAGT T AT C C T AAT T G 
C AGG AG GT CT T GAT C GCG GT AAT GAGT T T GAT GAAT T G AT AC C AGAT AT C 
ACTGGACTTAAACATATGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAA 
ACGT G C T G C AC AAAAAGC AG GAG T AAC T TAT AG C GAT G CT T T AG AT GT T A 
GAGATGCGGTACATAAAGCTTATGAGGTGGCACAACAGGGCGATGTTATC 
TTGCTAAGTCCTGCAAATGCATCATGGGACATGTATAAGAATTTCGAAGT 
C CGT G GT GAT GAAT T CAT T GAT ACT T T C G AAAGT C T TAG AGG AG AG 

SEQ ID NO. 6703 
STRAIN H36B

GG AC GAGT AAT GAAAAC AAT AAC AAC AT T T G AAAAT 

AAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCTGCTGCACG 
T T T GT T AGC T AAGT T AGG AG C AAT AGT GAC AGT T AAT GAT G GC AAAC CAT 
T T GAT G AAAAT C C AAC AG C AC AG T C T T T GT T GGAAGAG GGT AT T AAAGT G 
GTTTGTGGTAGTCATCCTTTAGAATTGTTAGATGAGGATTTTTGTTACAT 
GAT T AAAAAT C C AGG AAT AC C T TAT AAC AAT C C T AT GGT C AAAAAAG C AT 
T AGAAAAAC AAATCCCTGTTTTGACTGAAGTGGAATT AGCATACTTAGTT 
T CAGAAT CT C AG CT AAT AGGT AT T AC AG G C T C T AAC G G GAAAAC GAC AAC 
GACAACGATGATTGCAGAAGTCTTAAATGCTGGAGGTCAGAGAGGTTTGT 
TAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTGCGAAT 
GAT AAAGAT ACT CTAGTTATGGAATT AT CAAGTTTTCAGCTAATGGGAGT 
TAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCCAACTC 
AT T T AGAT TAT C AT GGGT C T T T T GAAGAT TAT G T T G C T G C AAAAT G GAAT 
ATCCAAAATCAAATGTCTTCATCTGATTTTTTGGTACTTAATTTTAATCA 
AGGT ATTTCT AAAG AGTT AG CTAAAACT ACT AAAG C AAC AAT CGTTCCTT 
T C T C T AC T AC G G AAAAAG T T GAT GGTGCT T AC GT AC AAG AC AAG C AAC T T 
TTCTATAAAGGGGAGAATATTATGTCAGTAGATGACATTGGTGTCCCAGG 
AAGC C AT AAC GT AG AG AAT G CT C T AG C AACT AT TGCGGTTGC T AAAC T G G 
CT G GT AT C AG T AAT C AAG T TAT TAG AG AAAC T T T AAG C AAT T T T G G AGG T 
GTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTATTAGTTTCTA 
TAACGACAGCAAG 

SEQ ID NO. 6704 
STRAIN 18RS21 

GG AC GAGT AAT GAAAAC AAT AAC AAC AT TT G 

AAAATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCTGCT 
GCACGTTTGTTAGCTAAGTTAGGAGCAATAGTGACAGTTAATGATGGCAA 
AC CAT T T GAT G AAAAT C C AAC AG C AC AG T C T T T GT T GGAAGAG G GT AT T A 
AAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTAGATGAGGATTTTTGT 
TAC AT GAT T AAAAAT CC AGG AAT ACCTT AT AAC AAT CCTATGGTCAAAAA 
AGCATTAGAAAAACAAATCCCTGTTTTGACTGAAGTGGAATTAGCATACT 
TAGTTTC AGAAT CT CAGCT AAT AGGT ATT ACAGGCTCTAACGGGAAAACG 
AC AAC GAC AAC GAT GAT T G C AG AAGT C T T AAAT G C T GG AG G T C AG AG AG G 
TTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTG 
C GAAT GAT AAAGAT AC T C T AG T T AT GG AAT TAT C AAGT T T T C AG C T AAT G 
GGAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCC 
AACTCATTTAGATTATCATGGGTCTTTTGAAGATTATGTTGCTGCAAAAT 
GGAATATCCAAAATCAAATGTCTTCATCTGATTTTTTGGTACTTAATTTT
AATCAAGGTATTTCTAAAGAGTTAGCTAAAACTACTAAAGCAACAATCGT
TCCTTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGACAAGC 
AACTTTTCTATAAAGGGGAGAATATTATGTCAGTAGATGACATTGGTGTC 
CCAGGAAGCCATAACGTAGAGAATGCTCTAGCAACTATTGCGGTTGCTAA 
ACT GG C T GGT AT C AGT AAT C AAGT T AT T AG AG AAACT T T AAG C AATT T T G 
GAGGTGTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTATTAGT 
T T CT AT AAC GAC AG C AAGT C AACT AAT AT AT T GG C AAC T C AAAAAGC AT T 
ATCTGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTTGATC 
GCGGTAATGAGTTTGATGAATTGATACCAGATATCACTGGACTTAAACAT 
ATGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAAACGTGCTGCACAAAA 
AG C AG G AGT AACT TAT AG C GAT G CT T TAG AT GT T AG AGAT GC G G T AC AT A 
AAGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTGCA 
AAT G CAT CAT GGG AC AT GT AT AAG AAT TT C GAAGT C CG T GGT GAT G AAT T 
CATTGATACTTTCGAAAGTCTTAGAGGAGAG 

SEQ ID NO. 6705 
STRAIN M732 

GGACGAGT AAT GAAAACAAT AAC AAC AT T T G AAA 

ATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCCGCTGCA 
C GT T T GT T AG C T AAG T T AG GAG C AAT AGT GAC AGT T AAT GAT G G C AAAC C 
AT T T GAT G AAAAT CC AAC AGC AC AGT C TT T GT T GG AAG AG GGT AT T AAAG 
TGGTTTGTGGTAGTCATCCTTTAGAATTGTTAGATGAGGATTTTTGTTAC 
ATGATTAAAAATCCAGGAATACCTTATAACAATCCTATGGTCAAAAAAGC 
ATTAGAAAaACAAATCCCTGTTTTGACTGAAGTGGAATTAGCATACTTAG
T TT C AG AAT CT C AG CT AAT AGG T AT T AC AG G C T CT AACGGG AAAAC GAC A 
AC G AC AACGAT GAT T G C AG AAGT C T T AAAT G C T GG AGGT C AGAG AG GT T T 
GTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTGCGG 
a T GAT AAAG AT AT T CT AG T T AT G G AAT TAT C AAGT T T T C AG C T AAT G G GA 
GTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCCAAC 
TCAtTTAGATTATCATGGGTCTTTTGAAGATTATGtTGCTGCAAAATGGA 
ATATCCAAAATCAAATGTCTTCATCTGATTTTTTGGTACTTAATTTTAAT 
C AAG GT ATT T CT AAAG AGT T AGC TAAAACT ACT AAAG C AAC Aa T C GT T CC 
TTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGACAAGCAAC 
T T T T CT AT AAAG GG GAG AAT AT TAT GT C AGT AG AT GAC AT TGGTGTCC C A 
GGAAGC CAT AACGTAGAGAATGCTCT AGC AACT ATT GCGGTTGCT AAACT 
AG C T GGT AT CAGT AAT C AAGT TAT T AGAG AAAC T T T AAG C AAT T T T G GAG 
G T GT T AAAC AC CGCT T G C AAT C AC T C GGT AAG G T T CAT GGT AT TAG T T T C 
T AT AACGACAGCAAGTCAACTAAT AT ATTGGCAACTC AAAAAGC ATT AT C 
TGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTTGATCGCG 
GTAATGAGTTTGATGAATTGATACCAGATATCACTGGACTTAAACATATG 
GTTGTTTTAGGGGAATCGGCATCTCGAGTAAAACGTGCTGCAC AAAAAGC 
AGGAGTAACT T AT AGCGAT GCTT T AGATGT TAGAGAT GCGGT ACAT AAAG 
CTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTGCAAAT 
G CAT C AT GG GAC AT GT AT AAG AAT T T C GAAGT C C GT G GT GAT G AAT T CAT 
T GAT AC T T T C G AAAGT C T T AG AGG AG AG 

SEQ ID NO. 6706 
STRAIN COH1 

G GAC G AGT AAT GAAAACAAT AAC AAC AT T T G A 

AAAT AAAAAAGT T T TAG TCCTTGGTT T AG C AC GAT C T GG AG AAG C C GC T G 
C AC G T T T GT T AG CT AAG T T AGG AG C AAT AGT GAC AGT T AAT GAT GG C AAA 
C C AT T T GAT G AAAAT CC AAC AG C AC AGT CTTTGTTG G AAG AG G G T AT T AA 
AGT GGT TTGT GGT AGT CAT CCTTTAGAATTGTT AGAT GAGGATTTTTGTT 
AC AT GAT T AAAAAT C C AG G AAT AC C T TAT AAC AAT C C TAT GGT C AAAAAA 
G CAT TAG AAAAAC AAAT C C C T GT T T T G AC T G AAG T GGAAT TAG CAT AC T T 
AGT T T C AG AAT CT C AG CT AAT AGG TAT T AC AG GC T CT AAC G G G AAAAC G A 
C AAC GAC AAC GAT GAT T G C AG AAG T CT T AAAT G C T GG AGGT C AG AG AG G T 
TTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTGC 
GG a T GAT AAAG AT AT T C T AG T TAT G G AAT T AT C AAGT T T T C AG C T AATG G 
GAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCCA 
ACTCATTTAGATTATCATGGGTCTTTTGAAGATTATGTTGCTGCAAAATG 
GAAT AT CCAAAAT CAAAT GT CTT C AT CT GAT T TT TT GGT ACT T AAT T T T A 
ATCAAGGTATTTCTAAAGAGTTAGCTAAAACTACTAAAGCAaCAATCGTT 
CCTTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGACAAGCA
ACTTTTCTATAAAGGGGAGAATATTATGTCAGTAGATGACATTGGTGTCC
CAGGAAGCCATAACGTAGAGAATGCTCTAGCAACTATTGCGGTTGCTAAA 
C T AG CT GGT AT C AGT AAT C AAGT T AT T AGAG AAAC T T T AAG C AAT TT T G G 
AGGTGTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTATTAGTT 
TCTATAACGACAGCAAGTCAACTAATATATTGGCAACTCAAAAAGCATTA 
TCTGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTTGATCG 
CGGT AAT GAGTTTGATGAAT TGAT ACCAGAT AT C ACT GGACTT AAACATA 
TGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAAACGTGCTGCACAAAAA 
G C AGG AGT AAC T TAT AG C GAT G C T T TAG AT GT T AG AG AT GCG GT AC AT AA 
AGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTGCAA 
ATGCATCATGGGACATGTATAAGAATTTCGAAGTCCGTGGTGATGAATTC 
ATTGATACTTTCGAAA 

SEQ ID NO. 6707 
STRAIN M781 

GGACGAGTAATGAAAACAATAACAACATT 

T GAAAAT AAAAAAGT TT T AGT CCTT GGT TTAGC ACGATCTGGAGAAGC CG 
CTGCACGTTTGTTAGCTAAGTTAGGAGCAATAGTGACAGTTAATGATGGC 
AAAC C ATTTG AT GAAAAT CC AAC AGC AC AGTCTTTGTTGGAAGAGGGT AT 
TAAAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTAGATGAGGATTTTT 
GTTACATGATTAAAAATCCAGGAATACCTTATAACAATCCTATGGTCAAA 
AAAGC AT T AGAAAAAC AAAT CCCTGTTTT G AC T G AAGT GG AAT T AGC AT A 
CTTAGTTTCAGAATCTCAGCTAATAGGTATTACAGGCTCTAACGGGAAAA 
C G AC AACG AC AAC GAT GAT T G C AG AAGT C T T AAAT G C T G G AGGT C AG AGA 
GGTTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGC 
TGCGGATGATAAAGATATTCTAGTTATGGAATTAT CAAGTTTTCAGCTAA 
TGGGAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATG 
CC AACT C ATTTAGATT AT CATGGGT CTTTTGAAGATT ATGTTGCTGCAAA 
ATGGAAT AT C CAAAAT CAAAT GT CT T C AT CT GAT T TT T TGGT ACT T AATT 
T T AAT C AAG GT AT T T C T AAAG AG T T AG CT AAAACT AC T AAAG C Aa C AAT C 
GTTCCTTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGACAA 
G C AAC T T T T CT AT AAAG GG GAG AAT AT T AT GT C AGT AG AT G AC AT T G G T G 
T C C C AG G AAG C CAT AAC GT AG AG AAT G C T CT AG C AACT AT TGCGGTTGCT 
AAACT AG C T GGT AT C AG T AAT C AAGT T AT T AG AGAAACT T T AAG C AAT T T 
TGGAGGTGTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTATTA 
G T T T C TAT AAC G AC AG C AAGT C AACT AAT AT AT T G G C AAC T C AAAAAG C A 
TTATCTGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTTGA 
T CGCGGT AATGAGTTT GAT G AAT TGAT ACCAGAT AT CACTGG ACT T AAAC 
ATATGGTTGTTTTAgGGGAATCGGCATCTCGAGTAAAACGTGCTGCACAA 
AAAG CAGG AGT a AC T TAT AG C GAT G C T T TAG AT GT TAG AG AT G C GG T AC A 
TAAAGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTG 
CAAAT G CAT CAT G G G AC AT GT AT AAG AAT T T CG AAG TCCGTGGT GAT G AA 
T T CAT T GAT ACT T T C G AAAG T C T T AGAGG AG AG 

SEQ ID NO. 6708 
STRAIN CJB110 

GGACGAGT AATGAAAACAAT AACAACAT T T GA 

AAATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGCCGCTG 
C AC GT T T GT TAG C T AAGT TAG GAG C AAT AGT GAC AGT T AAT GAT GG C AAA 
C C AT T T GAT GAAAAT C C AAC AG C AC AGT C T T T GT T GG AAG AG G G T AT T AA 
AGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTAGATGAGGATTTTTGTT 
AC AT G AT T AAAAAT C C AG GAAT AC C T TAT AAC AAT C C TAT GGT C AAAAAA 
G C AT TAG AAAAAC AAAT CCCTGTTTT G ACT G AAG T GG AAT T AG CAT AC T T 
AGTTTCAGAATCTCAGCTAATAGGTATTACAGGCTCTAACGGGAAAACGA 
C AAC GAC AAC GAT GAT T G C AG AAGT C T T AAAT G CT GGAGGT C AG AG AG GT 
TTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAGGCTGC 
GGATGATAAAGATATTCTAGTTATGGAATTATCAAGTTTTCAGCTAATGG 
GAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAATGCCA 
ACTCATTTAGATTATCATGGGTCTTTTGAAGAATATGTTGCTGCAAAATG 
GAATATCCAAAATCAAATGTCTTCATCTGATTTTTTGGTACTTAATTTTA 
ATCAAGGTATTTCTAAAGAGTTAGCTAAAACTACTAAAGCAACAATCGTT 
CCTTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGACAAGCA 
ACT T T T C T AT AAAG G GG AG AAT AT TAT G T T AGT AG AT GAC AT TGGTGTCC 
CAGGAAGCCATAACGTAGAGAATGCTCTAGCAACTATTGCGGTTGCTAAA
CTAGCTGGTATCAGTAATCAAGTTATTAGAGAAACTTTAAGCAATTTTGG
AGGTGTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTATTAGTT 
T CT AT AAT G AC AG C AAGT C AACT AAT AT AT T GG C AAC T C AAAAAG CAT T A 
TCTGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTTGATCG 
CGGT AATGAGT T T GAT GAAT T GAT AC C AGAT AT C AC T GG AC T T AAAC AT A 
TGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAAACGTGCTGCACAAAAA 
G C AG G AGT AACT T AT AG C GAT GC T T T AGAT GT T AG AG AT GCGGT AC AT AA 
AGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCCTGCAA 
AT G CAT CAT G GG AC AT GT AT AAG AAT T T CG AAGT C CGT G GTGAT GAAT T C 
ATTGATACTTTCGAAAGTCTTAGAGGAGAG 

SEQ ID NO. 6709 
STRAIN 1169NT 

C AAT AAC AAC AT T T G AAAAT AAAAAAGT T T T AGT C C T T G GT T T AG C AC G A 
TCTGGAGAAGCCGCTGCACGTTTGTTAGCTAAGTTAGGAGCAATAGTGAC 
AGT T AAT GAT G G C AAAC CAT T T GAT G AAAAT C C AAC AG C AC AGT CT T T GT 
TGGAAGAGGGTATTAAAGTGGTTTGTGGTAGTCATCCTTTAGAATTGTTA 
GAT GAG GAT T T T T G T TAG AT GAT T AAAAAT C C AG GAAT AC C T TAT AAC AA 
TCCTATGGTCAAAAAAGCATTAGAAAAACAAATCCCTGTTTTGACTGAAG 
TGGAATTAGCATACTTAGTTTCAGAATCTCAGCTAATAGGTATTACAGGC 
T CT AAC GGG AAAACGAC AAC GAC AACGAT GAT T G C AG AAGT CT T GAAT G C 
TGGAGGTCAGAGAGGTTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTG 
AAGTTGTTCAGGCTGCGGATGATAAAGATACTCTAGTTATGGAATTATCA 
AGTTTTCAGCTAATGGGAGTTAAGGAATTTCGTCCTCATATTGCAGTAAT 
TACTAATTTAATGCCAACTCATTTAGATTATCATGGGTCTTTTGAAGAtT 
AT G t TGCTGCAAAATGGAAT AT CCAAAAT C AAATGT CT T CAT CT GAT TT T 
T T GGT AC T T AAT T T T AAT C AAGGT AT T T C T AAAGAGT T AG c T AAAAC T AC 
TAAAGCAACAATCGTTCCTTTCTCTACTACGGAAAAAGTTGATGGTGCTT 
ACGTACAAGACAAGCAACTTTTCTATAAAGGGGAGAATATTATGTCAGTA 
GAC GAC AT TGGTGTCC C AG G AAG C C AT AACGT AGAGAAT G CT CT AG C AAC 
TAT TGCGGTTGC T AAAC T AG CT G GT AT CAGT AAT C AAGT TAT TAG AG AAA 
CTTTAAGCAATTTTGGAGGTGTT AAAC ACCGCTTGCAATCACT CGGT AAG 
GTTCATGGTATTAGTTTCTATAACGACAGTAAGTCAACTAATATATTGGC 
AACTCAAAAAGCATTATCTGGCTTTGATAATACTAAAGTTATCCTAATTG 
C AGG AGGT CT T GAT C GC GG T AAT G AGT T T GAT GAAT T GAT AC C AG AT AT C 
ACT GG AC T T AAG CAT AT GGT T G T T T T AG G GG AAT CG G CAT CT C G AGT AAA 
ACGTGCTGCACAAAAAGCAGGAGTAACTTATAGCAATGCTTTAgATGTTA 
GAgATGCgGTACATAAAGCTTATGAGGTGGCACAACAGGGCGATGTTATC 
TT GTTmAGT c CTGCGAATGCAT CAT GGGACAT GT AT AAGAAT T T CGAAGT 
C CGT G GT GAT GAAT T CAT T GAT ACT T T C G 

SEQ ID NO. 6710 

STRAIN JM9130013 

GGACGAGT AAT GAAAAC AATAACAACA 

TTTGAAAATAAAAAAGTTTTAGTCCTTGGTTTAGCACGATCTGGAGAAGC 
T G C T G C AC GT T T GT T AG CT AAG T TAG GAG C AAT AGT GAC AGT T AAT GAT G 
GC AAAC CAT T T GAT G AAAAT C C AAC AGC AC AGT C T T T G T T GG AAGAGGG T 
AT T AAAGT G GT T T GT G GT AGT CAT C CT T TAG AAT T G t T AG AT GAG GAT T T 
T T G T T AC AT GAT T a AAAAT C C AGGAAT AC C T TAT AAC AAT C C T AT GGT C A 
AAAAAG CAT TAG AAAAACAAAT CCCTGTTTT G ACT G AAGT GGAAT T AG C A 
TACTT AGT TTCAGAATCTCAGCT AAT AGGT ATT ACAGGCTCTAACGGGAA 
AAC GAC AAC GAC AAC GAT GAT T G C AG AAG T C T T AAAT G C T GG AG G T C AG A 
GAGGTTTGTTAGCTGGGAATATCGGCTTTCCTGCTAGTGAAGTTGTTCAG 
GCTGCGAATGATAAAGATACTCTAGTTATGGAATTATCAAGTTTTCAGCT 
AATGGGAGTTAAGGAATTTCGTCCTCATATTGCAGTAATTACTAATTTAA 
TGCCAACTCATTTAGATTATCATGGGTCTTTTGAAGATTATGTTGCTGCA 
AAATGGAAT AT CCAAAAT CAAAT GT CTT CAT CT GAT TT TT TGGTACT TAA 
TTTTAATCAAGGTATTTCTAAAGAGTTAGCTAAAACTACTAAAGCaACAA 
TCGTTCCTTTCTCTACTACGGAAAAAGTTGATGGTGCTTACGTACAAGAC 
AAGCAACTT T TCT AT AAAGGGGAGAAT AT T ATGT C AGT AGAT GAC AT TGG 
T GT C C C AG G AAGC C AT AAC GT AG AGAAT GCT C T AG C AAC TAT TGCGGTTG 
CTAAACTGGCTGGT AT CAGT AAT C AAGT T ATT AGAGAAACTTTAAGC AAT 
TTTGGAGGTGTTAAACACCGCTTGCAATCACTCGGTAAGGTTCATGGTAT 
TAGtTTCTATAACGACAGCAAGTCAACTAATATATTGGCAACTCAAAAAG
CATTATCTGGCTTTGATAATACTAAAGTTATCCTAATTGCAGGAGGTCTT
GAT CGCAGTAAT GAGT T T GATGAAT TGAT ACCAGAT AT CACTGGACTTAA 
ACATATGGTTGTTTTAGGGGAATCGGCATCTCGAGTAAAACGTGCTGCAC 
AAAAAG C AGG AGT AACT T AT AG C GAT G C T T T AGAT GT T AGAGAT G CGGT A 
CATAAAGCTTATGAGGTGGCACAACAGGGCGATGTTATCTTGCTAAGTCC 
T GC AAAT G CAT C AT GGG AC AT GT AT AAG AAT T T C G AAG T C C GT G GTG AT G 
AAT T CAT T GAT AC t T T C G AAAGT CT T AG AGGAG AG 

SEQ ID NO. 6710 

STRAIN 2603 

ggacgagtaatgaaaacaataacaacatttgaaaataaaaaagttttagt 
ccttggtttagcacgatctggagaagctgctgcacgtttgttagctaagt 
taggagcaatagtgacagttaatgatggcaaaccatttgatgaaaatcca 
acagcacagtctttgttggaagagggtattaaagtggtttgtggtagtca 
tcctttagaattgttagatgaggatttttgttacatgattaaaaatccag 
gaataccttataacaatcctatggtcaaaaaagcattagaaaaacaaatc 
cctgttttgactgaagtggaattagcatacttagtttcagaatctcagct 
aataggtattacaggctctaacgggaaaacgacaacgacaacgatgattg 
cagaagtcttaaatgctggaggtcagagaggtttgttagctgggaatatc 
ggctttcctgctagtgaagttgttcaggctgcgaatgataaagatactct 
agttatggaattatcaagttttcagctaatgggagttaaggaatttcgtc 
ctcatattgcagtaattactaatttaatgccaactcatttagattatcat 
gggtcttttgaagattatgttgctgcaaaatggaatatccaaaatcaaat 
gtcttcatctgattttttggtacttaattttaatcaaggtatttctaaag 
agttagctaaaactactaaagcaacaatcgttcctttctctactacggaa 
aaagttgatggtgcttacgtacaagacaagcaacttttctataaagggga 
gaatattatgtcagtagatgacattggtgtcccaggaagccataacgtag 
agaatgctctagcaactattgcggttgctaaactggctggtatcagtaat 
caagttattagagaaactttaagcaattttggaggtgttaaacaccgctt 
gcaatcactcggtaaggttcatggtattagtttctataacgacagcaagt 
caactaatatattggcaactcaaaaagcattatctggctttgataatact 
aaagttatcctaattgcaggaggtcttgatcgcggtaatgagtttgatga 
attgataccagatatcactggacttaaacatatggttgttttaggggaat 
cggcatctcgagtaaaacgtgctgcacaaaaagcaggagtaacttatagc 
gatgctttagatgttagagatgcggtacataaagcttatgaggtggcaca 
acagggcgatgttatcttgctaagtcctgcaaatgcatcatgggacatgt 
ataagaatttcgaagtccgtggtgatgaattcattgatactttcgaaagt 
cttagaggagag 

SEQ ID NO. 6711 

STRAIN 090 frame: 3 

ITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGIKVVCGS
HPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGITGSNGK 
TTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVKEFRPHI
AVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTKATIVPF 
STTEKVDGAYVQDKQLFYKGENIMLVDDIGVPGSHNVENALATIAVAKLAGISNQVIRET 
LSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLDRGNEFD 
ELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGDVILLSP 
ANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6712 

STRAIN A909 frame: 3

ITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGIKVVCGS 
HPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGITGSNGK 
TTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVKEFRPHI
AVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTKATIVPF 
STTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVXNALATIAVAKLAGISNQVIRET
LSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLDRGNEFD 
ELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGDVILLSP
ANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6713 

STRAIN H36B frame: 1

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK 
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN 
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSK 

SEQ ID NO. 6714 

STRAIN 18RS21 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN 
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6715 

STRAIN M732 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN 
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6716 

STRAIN COH1 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFE

SEQ ID NO. 6717 

STRAIN M781 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN 
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6718 

STRAIN CJB110 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI 
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDILVMELSSFQLMGVK 
EFRPHIAVITNLMPTHLDYHGSFEEYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMLVDDIGVPGSHNVENALATIAVAKLAGISN 
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE

SEQ ID NO. 6719 

STRAIN 1169NT frame: 3 

I TT FENKKVLVLGL ARS GEAAARLL AKLGAI VT VN DGKP FDEN PT AQS L LEE GIKVVCGS 
HPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGITGSNGK
TTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAADDKDTLVMELSSFQLMGVKEFRPHI
AVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTKATIVPF 
STTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISNQVIRET 
LSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLDRGNEFD 
ELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSNALDVRDAVHKAYEVAQQGDVILXSP
ANASWDMYKNFEVRGDEFIDTF

SEQ ID NO. 6720 

STRAIN JM9130013 frame: 1 

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK 
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RSNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE 

SEQ ID NO. 6721 

STRAIN 2603 frame: 1

GRVMKTITTFENKKVLVLGLARSGEAAARLLAKLGAIVTVNDGKPFDENPTAQSLLEEGI
KVVCGSHPLELLDEDFCYMIKNPGIPYNNPMVKKALEKQIPVLTEVELAYLVSESQLIGI
TGSNGKTTTTTMIAEVLNAGGQRGLLAGNIGFPASEVVQAANDKDTLVMELSSFQLMGVK
EFRPHIAVITNLMPTHLDYHGSFEDYVAAKWNIQNQMSSSDFLVLNFNQGISKELAKTTK 
ATIVPFSTTEKVDGAYVQDKQLFYKGENIMSVDDIGVPGSHNVENALATIAVAKLAGISN
QVIRETLSNFGGVKHRLQSLGKVHGISFYNDSKSTNILATQKALSGFDNTKVILIAGGLD 
RGNEFDELIPDITGLKHMVVLGESASRVKRAAQKAGVTYSDALDVRDAVHKAYEVAQQGD 
VILLSPANASWDMYKNFEVRGDEFIDTFESLRGE 

SEQ ID NO. 6801 
STRAIN 2603 

AT GGC T AAAG AG AGG GT AG AT GT T C T T G C C T AT AAAC AG GG AC T T T T T GAT AC AC GAG AG 
CAAGCGAAACGT GGTGT T AT GGC AGGAAT GGTGAT T AAC GT T AT CAAT GGAGAACGT TAT 
GAT AAAC C AGGT G AAAAG G T T G C AG AC GAT AC T G AAT T AAAAC T AAAAGGT G AAAAAC T A 
AAATAT GT T AGT AGAGGT GGATTGAAATTAGAAAAAGCTTT AC AAGTTT TT GAAATT T CA 
GTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGGTGGTTTTACTGATGTTATG 
CT AC AAT C AG G AGC G CGT T T AG T T T AC G C AGT AG AT GT AGG AAC AAAT CAAT T AGT T T G G 
AAGT T AC GT C AGGAT CAT CGTGTTCGTT CT AT G GAAC AAT AT AAT T T T AGGT AT G C C C AA 
AAAGAAGATTTCAAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCT 
CTTAATTTGATTTTACCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAAGTAGTGGCA 
T T AAT T AAAC C AC AAT T T GAAG C AG GT CGT GAG C AAAT T G G T AAAAAT G GT AT T G T C AAA 
GACAAGTTGGTTCATGAAAAGGTTTTGACAACAGTGACCAATTTCACGAAAGATTATGGA 
TATACGGTTAAACATCTTGATTTTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTT 
TTAAT GCAT TT GCAAAAGT GT C AAGAT C CACAAAAT CTT GTGCT TGACC AAAT AC AAGAT 
GT T AT AG AAAAAG C AC AT AAGGAAT T TAAG AAAAAT GAAG AAG AG 

SEQ ID NO. 6802 

STRAIN 090 

GCTAAAGAGAGGGTAGATGTTCTTGCCT 

AT AAAC AG GG AC T T T T T GAT AC AC GAG AG C AAG C G AAAC GT GGTGT TAT G 
G C AG G AAT G GT GAT T AAC GT TAT CAAT G GAG AAC G T T AT GAT AAAC C AG G 
T G AAAAG GT T G C AG AC GAT AC T G AAT T AAAAC T AAAAGGT G AAAAAC TAA 
AATATGTTAGTAGAGGTGGATTGAAATTAGAAAAAGCTTTACAAGTTTTT 
GAAATTTCAGTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGG 
TGGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAG 
TAG AT GT AGG AAC AAAT C AAT T AGT T T G GAAG T T ACGT C AG GAT CAT CGT 
GTTCGTTCTATGGAACAATATAATTTTAGGTATGCCCAAAAAGAAGATTT 
C AAGG AGGG AC T G C C T G AAT T T G C AT CG AT AG AT G T C T CAT T T AT C T C T C 
TTAATTTGATTTTACCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAA 
GT AGT GG CAT T AAT T AAAC C AC AAT T T GAAG C AGGT CGT GAG C AAAT T G G 
T AAAAAT GGT AT T GT C AAAG AC AAG T T G GT T CAT G AAAAGG T T T T G AC AA 
C AGT G AC CAAT T T C AC G AAAG AT TAT G GAT AT ACG G T T AAAC AT CTT GAT 
TTTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTT 
GCAAAAGTGTCAAGATCCACAAAATCTTGTGCTTGACCAAATACAAGATG
TT ATAGAAAAAGC ACATAAGGAATTT AAGAAAAAT GAAGAAGAG 

SEQ ID NO. 6803 

STRAIN A909 

G C T AAAG AGAG GGT AG AT GT T C T T G C CT A 

TAAACAGGGACTTT TTGATACACGAGAGCAAGCGAAACGTGGT GT TAT GG 
CAGGAATGGTGATTAACGTTATCAATGGAGAACGTTATGATAAACCAGGT 
GAAAAGGT T GC AGAC GAT ACT G AATT AAAACT AAAAGGT G AAAAAC T AAA 
AT AT GT TAG TAG AG G T GG AT T G AAAT T AG AAAAAG C T T T AC AAGT T T T T G 
AAAT T T C AGT T G C AG AT AAG CT AACT AT AGAT AT T G G C GCC T CT ACGGG T 
GGT T T TACT GAT GT T AT GC T AC AAT C AGG AG C G C GT T T AGT T T ACG C AGT 
AGATGTAGGAACAAATCAATTAGTTTGGAAGTTACGTCAGGATCATCGTG 
T T C GT T CT AT G G AAC AAT AT AAT T T T AG GT AT GCC C AAAAAG AAG AT T T C 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
T AAT T T GAT T T T AC C AGC T CT AAAAG AAAT T T T AGT G GAT G GT GG AC AAG 
T AGT GGCATT AATT AAAC CACAAT TTGAAGCAGGTCGT GAGCAAATT GGT 
AAAAAT G GT AT T GT C AAAGAC AAGT T GGT T CAT G AAAAG GT T T T GAC AAC 
AGT G AC C AAT T T C AC G AAAGAT T AT GG AT AT ACGGT T AAAC AT C T T GAT T 
TTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT GAC C AAAT AC AAGAT GT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6804 

STRAIN H36B 

G C T AAAG AG AG G GT AG AT GT T C T T G C CT AT AAAC AGG 
GAC T T T T T GAT AC AC GAG AGC AAG C GAAAC GT GG T GT TAT G G C AG G AAT G 
GTGATTAACGTTATCAATGGAGAACGTTATGATAAACCAGGTGAAAAGGT 
T G C AG AC GAT AC T G AAT T AAAAC T AAAAGGT GAAAAACT AAAAT AT G T T A 
GT AG AGGT G GAT T G AAAT TAG AAAAAG C T T T AC AAGT T T T T G AAAT T T C A 
GTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGGTGGTTTTAC 
TGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGTAGATGTAG 
G AAC AAAT C AAT T AGT T T GGAAGT T AC G T C AGG AT CAT CGTGTTCGTTCT 
AT GG AAC AAT AT AAT T T T AG GT AT GCC C AAAAAG AAG AT T T C AAGGAGG G 
ACT GC CT GAAT TTGCAT CGAT AGAT GT CT CAT T TAT CT CT CT T AAT TTGA 
T T T T AC C AG C T C T AAAAG AAAT T T T AGTG GAT GGT GGAC AAGT AGT GG C A 
T T AAT T AAAC CACAAT T TGAAGC AGGT CGT GAG CAAAT T GGTAAAAAT GG 
TAT T GT C AAAG AC AAGT T GGT T CAT GAAAAGGT T T T GAC AAC AGT GAC C A 
AT T T C AC GAAAG AT T AT G GAT AT AC GGT T AAAC AT C T T GAT TTTTCGCCC 
AT T C AAG GT G GAC AT GG AAAT AT T G AGT T T T T AAT G C AT T T G C AAAAGT G 
TCAAGATCCACAAAATCTTGTGCTTGACCAAATACAAGATGTTATAGAAA
AAG C AC AT AAGG AAT T T AAG AAAAAT GAAGAAGAG 

SEQ ID NO. 6805 

STRAIN 18RS21 

GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

T AAAC AG G GAC T T T T T GAT AC AC GAG AG C AAG C G AAAC GT GGT GT TAT G G 
C AGG AAT GGT G AT T AAC GT T AT C AAT GG AG AAC G T T AT GAT AAAC C AG GT 
G AAAAG G T T G C AG AC GAT ACT GAAT T AAAAC T AAAAGGT G AAAAAC T AAA 
AT AT GT T AGT AG AG GT G GAT T G AAAT T AG AAAAAGCT T T AC AAGT T T T T G 
AAAT T T C AGT T G C AG AT AAG C T AAC TAT AG AT AT TGGCGCCT C T AC GG G T 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AGAT GT AGG AAC AAAT C AAT T AGT T T G G AAGT T AC G T C AG GAT CAT C GT G 
TTCGTTCTATGGAAC AAT AT AATTTT AGGT ATGCCC AAAAAG AAG ATT TC 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
T AAT T T GAT T T T AC C AG C T C T AAAAG AAAT T T T AGT G GAT G GT G GAC AAG 
T AGT GGCATT AATT AAAC CACAAT TTGAAGCAGGTCGT GAGCAAATT GGT 
AAAAATGGT ATT GT CAAAGAC AAGTTGGT T C ATGAAAAGGT T T T GAC AAC 
AG T GAC C AAT T T C AC GAAAG AT TAT GG AT AT ACG GT T AAAC AT C T T G AT T 
TTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
C AAAAGT GT C AAG AT C C AC AAAAT CT T GT G C T T GAC CAAAT AC AAG AT GT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6806 
STRAIN M732

GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

TAAACAGGGACTTTTTGATACACGAGAGCAAGCGAAACGTGGTGTTATGG 
CAGGACTGGTGATTAACGTTATCAATGGAGAACGTTATGATAAACCAGGC 
GAAAAGGTTGCAGACGATACTGAATTAAAACTAAAAGGTGAAAAACTAAA 
ATATGTTAGTAGAGGTGGATTGAAATTAGAAAAAGCTTTACAAGTTTTTG 
AAATTTCAGTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGGT 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AGAT GTAGGAACAAAT CAAT TAGTT T GGAAGTT ACGT CAGGAT CAT CGTG 
TTCGTTCTATGGAACAATATAATTTTAGGTATGCCCAAAAAGAAGATTTC 
AAG G AGG GAC T G C C T G AAT T T GC AT C GAT AG AT GT C T CAT T TAT CT C T CT 
TAATTTGATTTTACCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAAG 
TAGTGGCATTAATTAAACCACAATTTGAAGCAGGTCGTGAGCAAATTGGT 
AAAAAT GG TAT T GT C AAAGAC AAGT T GGT T CAT G AAAAGG T T T T G AC AAC 
AGT G AC CAAT T T C ACGAAAG AT TAT G GAT AT AC GGT T AAAC AT C T T GAT T 
TTTCGCCCGTTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT GAC C AAAT AC AAG AT GT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6807 

STRAIN COH1 

GCTAAAGAGAGGGTAGATGTTCTTGCCT 

AT AAAC AGG GAC T T T T T GAT AC AC GAG AG C AAG C G AAAC GT G GT G T TAT G 
G C AGG AC T G GT GAT T AAC GT TAT CAAT GG AG AAC G T T AT GAT AAAC C AG G 
C G AAAAGGT T G C AGAC GAT AC T GAAT T AAAAC T AAAAG GT G AAAAAC T AA 
AATATGTTAGTAGAGGTGGATTGAAATTAGAAAAAGCTTTACAAGTTTTT 
GAAAT T T C AGT T G C AG AT AAG C T AACT AT AGAT AT TGGCGCCTC T AC GG G 
TGGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAG 
T AGAT GTAGGAAC AAAT CAATTAGTTT GGAAGTT ACGT CAGGAT CAT CGT 
G T T CGT T C T AT G G AAC AAT AT AAT T T T AGGT AT G C C C AAAAAG AAG ATT T 
C AAG GAG G GAC T GC C T GAAT T T G C AT C GAT AG AT GT C T CAT T T AT C T CT C 
TT AAT TTG AT TTT AC CAGCTCT AAAAG AAAT TTT AGT GGATGGTGGACAA 
GTAGTGGCATTAATTAAACCACAATTTGAAGCAGGTCGTGAGCAAATTGG 
TAAAAATGGTATTGTCAAAGACAAGTTGGTTCATGAAAAGGTTTTGACAA 
C AGT GAC CAAT T T C AC GAAAG AT T AT G G AT AT ACGGT T AAAC AT C T T GAT 
TTTTCGCCCGTTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTT 
G C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT GAC C AAAT AC AAG AT G 
T TAT AG AAAAAG C AC AT AAGG AAT T T AAGAAAAAT G AAG AAG AG 

SEQ ID NO. 6808 

STRAIN M781 

GCTAAAGAGAGGGTAGATGTTCTTGCCT 

AT AAAC AGG GAC T T T T T GAT AC ACG AG AG C AAG C G AAAC G T G GT GT TAT G 
G C AGG ACT G GT GAT T AACG T T AT CAAT G G AGAAC G T T AT GAT AAAC C AG G 
CGAAAAGGTTGC AG ACG AT ACT G AATTAAAACT AAAAGGT GAAAAACTAA 
AAT AT GT T AGT AG AGGT GG AT T GAAAT T AG AAAAAGC T T T AC AAGT T TT T 
GAAATTTCAGTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGG 
TGGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAG 
TAG AT GTAGGAAC AAAT C AAT T AGT T T G G AAG T T ACGT CAGGAT CAT CGT 
GTTCGTTCTATGGAACAATATAATTTTAGGTATGCCCAAAAAGAAGATTT 
CAAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTC 
T T AAT T T GAT T T T AC C AG C T C T AAAAG AAAT T T T AGT G GAT GGT GG AC AA 
GTAGTGGCATTAATTAAACCACAATTTGAAGCAGGTCGTGAGCAAATTGG 
T AAAAAT GGT ATTGTC AAAGAC AAGT T GGT TC AT G AAAAGGT TTT G AC AA 
CAGTGACCAATTTCACGAAAGATTATGGATATACGGTTAAACATCTTGAT 
TTTTCGCCCGTTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTT 
G C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT GAC C AAAT AC AAG AT G 
T T AT AGAAAAAG C AC AT AAGGAAT T T AAGAAAAAT G AAG AAGAG 

SEQ ID NO. 6809 

STRAIN CJB110 

GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

T AAAC AG G G AC T T T T T GAT AC AC GAG AG C AAG CGAAACGT G GT GT T AT G G 
CAGGAATGGTGATTAACGTTATCAATGGAGAACGTTATGATAAACCAGGT 
GAAAAGGTTGCAGACGATACTGAATTAAAACTAAAAGGTGAAAAACTAAA
AT AT GT T AGT AGAG GT G GAT T G AAAT T AG AAAAAG C T T T AC AAGT T T T T G 
AAATTTCAGTTGCAGATAAGCTAACTATAGATATTGGCGCCTCTACGGGT 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AGAT GT AG G AAC AAAT C AAT T AGT T T G G AAG T T AC GT C AG GAT CAT C GT G 
T T C GT T C T AT GG AAC AAT AT AAT T T T AG GT AT GC C C AAAAAG AAG AT T T C 
AAGG AG GG ACT G C CT GAAT T T G CAT C GAT AG AT GT C T CAT T T AT C T CT C T 
T AAT T T GAT T T T AC C AG CT CT AAAAGAAAT T T T AG T G G AT GG T G G AC AAG 
T AGT GG C AT T AAT T AAAC C AC AAT T T GAAG C AGGT CG T G AGC AAAT T G G T 
AAAAAT G GT AT T GT C AAAGAC AAGT T GG T T CAT GAAAAG GT T T T G AC AAC 
AGT G AC C AAT T T C AC G AAAG AT T AT GG AT AT ACG GT T AAAC AT C T T GAT T 
TTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT G AC C AAAT AC AAG AT G T 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6810 

STRAIN 1169NT 

GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

TAAACAGGGACTTTTTGATACACGAGAGCAAGCGAAACGTGGTGTTATGG 
C AGG ACT G GT G AT T AAC GT T AT C AAT GGAGAACG T TAT GAT AAAC C AG GC 
GAAAAG GT T G C AG AC GAT AC T GAAT T AAAAC T AAAAGGT G AAAAAC T AAA 
AT AT GT T AGT AG AGGT GG AT T GAAAT TAG AAAAAG C T T T AC AAGT T TT T G 
AAAT T T C AGT T G C AG AT AAG CT AAC TAT AG AT AT TGGCGCCT CT AC G GGT 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AG AT GT AGG AAC AAAT C AAT T AGT T T G G AAGT T AC G T C AG GAT CAT C GT G 
T T C GT T C TAT GG AAC AAT AT AAT T T T AGGT AT GC C C AAAAAG AAG AT T T C 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
TAATTTGATTTTGCCAGCTCTAAAAGAAATTTTAGTGGATGGTGGACAAG 
T AGT GGC AT T AAT T AAAC C AC AAT T T GAAG C AG G T C G T G AG C AAAT T GG T 
AAAAAT G GT AT T GT C AAAG AC AAGT T GGT T CAT G AAAAGGT T T T G AC AAC 
AG T GAC C AAT T T C ACG AAAG AT TAT G GAT AT AC G GT T AAAC AT C T T GAT T 
TTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
CAAAAGTGTCAAGATCCACAAAATCTTGTGCTTGACCAAATACAAGATGT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6811 

STRAIN JM9130013 
GCTAAAGAGAGGGTAGATGTTCTTGCCTA 

T AAAC AGG GAC TT T TT GAT AC ACG AG AG C AAG C G AAAC G T GG T GT T AT GG 
C AG GAAT GGT GAT T AAC GT TAT C AAT GG AG AAC G T T AT GAT AAAC C AGGT 
G AAAAGGT T G C AG ACGAT ACT G AAT T AAAAC T AAAAG G T G AAAAAC T AAA 
AT AT GT T AGT AG AG GT GG AT T GAAAT TAG AAAAAG C T T T AC AAG T T T T T G 
AAAT T T C AGT T G C AG AT AAGC T AACT AT AG AT AT T GG C G C CT C T AC GGG T 
GGTTTTACTGATGTTATGCTACAATCAGGAGCGCGTTTAGTTTACGCAGT 
AGAT GT AGGAACAAAT CAATTAGTT T GGAAGTT ACGT C AGGAT CAT CGT G 
TTCGTTCTATGGAACAATATAATTTTAGGTATGCCCAAAAAGAAGATTTC 
AAGGAGGGACTGCCTGAATTTGCATCGATAGATGTCTCATTTATCTCTCT 
T AAT T T GAT T T T AC C AG CT C T AAAAG AAAT T T T AGT GG AT G GT G GAC AAG 
TAGTGGCATTAATTAAACCACAATTTGAAGCAGGTCGTGAGCAAATTGGT 
AAAAATGGT AT TGT C AAAG AC AAGTTGGTTC AT G AAAAGGT TTT GAC AAC 
AG T GAC C AAT T T C AC G AAAG AT TAT G GAT AT AC GGT T AAAC AT C T T GAT T 
TTTCGCCCATTCAAGGTGGACATGGAAATATTGAGTTTTTAATGCATTTG 
C AAAAGT GT C AAG AT C C AC AAAAT CTTGTGCTT G AC C AAAT AC AAG AT GT 
TATAGAAAAAGCACATAAGGAATTTAAGAAAAATGAAGAAGAG 

SEQ ID NO. 6812 
STRAIN 2603 frame: 1

MAKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6813 
STRAIN 090 frame: 1

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK 
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE 

SEQ ID NO. 6814 

STRAIN A909 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK 
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6815 

STRAIN 18RS21 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6816 

STRAIN M732 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGLVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPVQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6817 

STRAIN COH1 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGLVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK 
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPVQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6818 

STRAIN M781 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGLVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK 
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPVQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6819 

STRAIN CJB110 frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6820 

STRAIN 1169NT frame: 1 

AKERVDVLAYKQGLFDTREQAKRGVMAGLVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE

SEQ ID NO. 6821 
STRAIN JM9130013 frame: 1

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE 

SEQ ID NO. 6822 

STRAIN H36B frame: 1

AKERVDVLAYKQGLFDTREQAKRGVMAGMVINVINGERYDKPGEKVADDTELKLKGEKLK 
YVSRGGLKLEKALQVFEISVADKLTIDIGASTGGFTDVMLQSGARLVYAVDVGTNQLVWK 
LRQDHRVRSMEQYNFRYAQKEDFKEGLPEFASIDVSFISLNLILPALKEILVDGGQVVAL
IKPQFEAGREQIGKNGIVKDKLVHEKVLTTVTNFTKDYGYTVKHLDFSPIQGGHGNIEFL 
MHLQKCQDPQNLVLDQIQDVIEKAHKEFKKNEEE 

SEQ ID NO. 6901 
STRAIN 2603 

ATGAATAAAAAGGTACTATTGACATCGACAATGGCAGCTTCGCTATTATCAGTCGCAAGT 
GTTCAAGCACAAGAAACAGATACGACGTGGACAGCACGTACTGTTTCAGAGGTAAAGGCT 
GAT T T GGTAAAGCAAG ACAATAAAT CAT CAT AT ACT GT GAAAT ATGGT GAT ACACT AAGC 
GT T AT T T C AGAAG C AAT GT C AAT T GAT AT GAAT GT CT T AGC AAAAAT AAAT AAC AT TG C A 
GATATCAATCTTATTTATCCTGAGACAACACTGACAGTAACTTACGATCAGAAGAGTCAT 
AC T G C C ACT T C AAT GAAAAT AG AAAC AC C AG C AAC AAAT GCTGCTGGT C AAAC AAC AG C T 
ACTGTGGATTTGAAAACCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTCTCTCAATACA 
AT T T C G G AAG G T AT G AC AC C AG AAG C AG C AAC AAC GAT TGTTTCGC C AAT G AAG AC AT AT 
TCTTCTGCGC C AG C T T T GAAAT C AAAAG AAG T AT TAG C AC AAG AGC AAG C T GT T AGT C AA 
G C AG C AGC T AAT G AAC AG GT AT C AC C AGC T C C T G T G AAGT C GAT T AC T T C AG AAGT T C C A 
G C AG C T AAAG AGG AAGT T AAAC C AACT CAG AC GT C AG T C AGT C AG T C AAC AAC AG TAT C A 
CCAGCTTCTGTTGCCGCTGAAACACCAGCTCCAGTAGCTAAAGTAGCACCGGTAAGAACT 
GTAGCAGCCCCTAGAGTGGCAAGTGTTAAAGTAGTCACTCCTAAAGTAGAAACTGGTGCA 
TCACCAGAGCATGTATCAGCTCCAGCAGTTCCTGTGACTACGACTTCACCAGCTACAGAC 
AGT AAGT T AC AAG C GAC T G AAG T T AAGAG CGT T C C G GT AG C AC AAAAAG C T C C AAC AG C A 
AC AC C G G T AG C AC AAC CAG C T T C AAC AAC AAAT G C AGT AG CT G C AC AT C C T GAAAAT G C A 
GGGCTCCAACCTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAACTTATGGAGTTAAT 
GAAT T CAG T AC AT AC C G T G C GG G AG AT C C AG GT G AT CAT G G T AAAG G T T TAG C AGT T GAC 
TTTATTGTAGGTACTAATCAAGCACTTGGTAATAAAGTTGCACAGTACTCTACACAAAAT 
AT GG CAG C AAAT AAC AT TT C AT AT GT TAT CT GG C AAC AAAAGT T T T AC T C AAAT AC AAAC 
AGTATTTATGGACCTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCC 
AAC C AC TAT GAC C AC GT T C AC G T AT CAT T T AAC AAAT AAT AT AAAAAAGG AAGC TAT T T G 
GCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGGTTCTTATATAATTTTTATTA 

SEQ ID NO. 6902 

STRAIN 090

T GAG AC AAC AC T GAC AGT AAC T T AC GAT C AG AAGAGT CAT AC T G C C AC TT 
C AAT GAAAAT AG AAAC AC CAG C AAC AAAT GCTGCTGGT C AAAC AC CAG C T 
ACTGTGGATTTGAAAACCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTC 
TCTCAATACAATTTCGGAAGGTATGACACCAGAAGCAGCAACAACGATTG
TTTCGCCAATGAAGACATATTCTTCTGCGCCAGCTTTGAAATCAAAAGAA 
GTATTAGCACAAGAGCAAGCTGTTAGTCAAGCAGCAGCTAATGAACAGGT 
AT C AAC AG C T C C T GT G AAG T C GAT T AC T T CAG AAG T T C CAG CAG C T AAAG 
AG G AAG T T AAAC C AAC T C AG ACGT CAG T C AGT CAG T C AAC AAC AG TAT C A 
CCAGCTTCTGTTGCCGCTGAAACACCAGCTCCAGTAGCTAAAGTAGCACC 
G G T AAG AAC T GT AG CAG C C C C TAG AGT GG C AAG T GT T AAAG TAG T C AC T C 
C T AAAG TAG AAAC T G G T G CAT C AC CAG AG CAT G T AT CAG C T C CAG CAG T T 
C C T G T GAC TAG GAC T T C AAC AG C T AC AG AC AG T AAG T T AC AAG C GAC T G A 
AGTTAAGAGCGTTCCGGTAGCACAAAAAGCTCCAACAGCAACACCGGTAG 
C AC AAC CAG CT T C AAC AAC AAAT G CAG TAG C T G C AC AT C C T GAAAAT G C A 
GGGCTCCAACCTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAACTTA 
T G GAG T T AAT GAAT T C AGT AC AT AC C GT G C AG GT GAT C CAG GT G AT CAT G 
GTAAAGGTTTAGCAGTCGACTTTATTGTAGGTAAAAACCAAGCACTTGGT 
AAT G AAGT T G C AC AG TACT C T AC AC AAAAT AT G G C AG C AAAT AAC AT T T C 
AT AT GT TAT c T G G C AAC AAAAGT T T T ACT C AAAT AC AAAT AG TAT T TAT G 
GACCTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCC 
AAC CAT TAT GAC CAT G T T C AC G T AT CAT T T AAC AAAT AAT AT AAAAAAG G 
AAGCTATTTGGCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGGTTCT
T AT AT AAT T TTTAT T A 

SEQ ID NO. 6903 

STRAIN A909 

CTGATTTGGTAAAGCAAGACAATAAATCATCATATACTGTGAA 
AT AT GGT GAT AC ACT AAG CGT T AT T T C AG AAGC AAT GT C AAT T GAT AT G A 
AT GT C T T AG C AAAAAT T AAT AAC AT T G C AG AT AT C AAT C T TAT T T AT C c T 
G AG AC AAC ACT G a C AGT AACT T AC GAT C AG AAG AGT C AT ACT G CT AC T T C 
AAT G AAAAT AG AAAC AC C AG C AAC AAAT GCTGCTGGT C AAAC Aa C AG c T A 
CTGTCGATTTGAAAACCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTCT 
CTC AATAC AAT T T CGGAAGGTAT GACACCAGAAGCAGCAAC AACGAT TGT 
T T CG C C AAT GAAG AC AT AT TCTTcTGCGC C AG CT T T GAAAT C AAAAG AAG 
TAT T Ag C AC AAG G G C a AG C T GT TAG T C AAGC AG C AG C T AAT G AAC AG GT A 
T C Ac C AG C T c C T GT G AAGT C GAT TACT T C AG AAGT T C C Ag C AG C T AAAG A 
G GAAG T T AAAC C Aa C T C Ag ACGT C Ag T C AG T C AGT C AAC AAC AGT AT C AC 
CAgCTTCTGTTGCCGCTGAAACACCAGCTCCAgTAGCTAAaGTAGCACCG 
G T AAG AAC T GT AG C AGC C C C TAG AGT G G C AAG T GT T AAAGT AGT C ACT C C 
TAAAGTAGAAACTGGTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTC 
CT GT GAC T AC G ACT T C AAC AGCT AC AG AC AGT AAGT T AC AAG C GACT GAA 
GTTAAGAGCGTTCCGGTAGCACAAAAAGCTCCAACAGCAACACCGGTAGC 
AC AAC C AGC T T C AAC AAC AAAT G C AG TAG C T G C AC AT C CT GAAAAT G C AA 
GGCTCCAACCTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAACTTAT 
GG AGT T AAT G AAT T C AGT AC AT AC C GT G C GGGAG AT C C AG GT GAT C AT GG 
TAAAGGTTTAGCAGTTGACTTTATTGTAgGTAAAAACCAAGCACTTGGTA 
AT G AAGT T G C AC AGT AC T C T AC AC AAAAT AT G G C AGC a AAT AAC AT T T C A 
TATGTTATCTGGCAACAAAAGTTTTACTCAAATaCAAATAGTATTTATGG 
ACcTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTAcTGCCA 
AC C a C TAT G AC C AC GT T C AC G T AT CAT T T AAC AAAT a AT AT AAAAAAGGA 
AGCTaTTTGGCTTCTTTTTTATATGCCTTGCATAGACtTTCAAGGTTCTT 
ATATAATTTTTATTA 

SEQ ID NO. 6904 

STRAIN H36B

C T GAT T T G GT AAAG C AAG AC AAT AAAT CAT CAT AT Ac TGT GAAAT A 
T GG T GAT AC Ac T AAGCGT T ATT T C AG AAG C AAT G T C a AT T GAT AT G AAT G 
T CT T AG C AAAAAT T AAT AAC AT T G C AG AT AT C AAT C T TAT T T AT C c T GAG 
AC AAC a C T G a C AG T Aa C T T ACG AT C AG AAGAG T CAT AC T G CT ACT T C AAT 
GAAAAT AG AAAC AC C AGC AAC AAAT GCTGCTGGT C AAAC AAC AG C TACT G 
TCGATTTGAAAACCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTCTCTC 
AAT AC AAT T T C G GAAG G TAT GAC AC C AG AAG C AG C AAC AACGAT T GT T T C 
GCCAATGAAGACATATTCTTCTGCGCCAGCTTTGAAATCAAAAGAAGTAT 
TAG C AC AAGG G C AAG C T GT T AGT C AAG C AG C AG C T AAT G AAC AG GT AT C A 
CC AGCT CCTGTGAAGTCGATT ACT TCAGAAGTTCCAGCAGCTAAAGAGGA 
AGT T AAAC C AAC T C AG AC GT C AG T C AGT C AG T C AAC AAC AGT AT C AC C AG 
CTTcTGTTGCCGCTGAAACACCAGCTCCAGTAGcTAAAGTAGCACCGGTA 
AG AACT GT AGC AGC CCcT AG AGT GGCAAGTGTT AAAGT AGT C ACT CcTAA 
AGT AGAAAC T G GT G CAT C AC C AG AG CAT GT AT C AG C T CC AG C AGT T C CT G 
T GAC T AC GAC T T C AAC AG CT AC AG AC AGT AAG T T AC AAG C GAC T G AAGT T 
AAGAG C G T T C C G GT AG C AC AAAAAG CT C C AAC AG C AAC AC C G GT AG C AC A 
AC C AG C T T C AAC AAC AAAT G C AG T AG C T G C AC AT C CT GAAAAT GC AAG G C 
T C C AAC C T CAT GT T G C AG CT T AT AAAGAAAAAG T AG C GT C AACT T AT GG A 
G T T AAT G AAT T C AGT AC AT AC C G T G C G G GAG AT C C AG GT G AT CAT G GT AA 
AGGT T TAG C AG T T G AC T T T AT T G T AG G T AAAAAC C AAG C AC T T GGT AAT G 
AAGT T G C AC AG TACT C T AC AC AAAA t a T G G C AG C AAAT AAC AT T T CAT AT 
G T TAT C T G G C a AC AAAAGT T T T AC T C AAAT AC AAAT AG T AT T T AT GG AC C 
TGCTAATACTTGGAATGCAATGCCAgATCGTGGTGGCGTTACTGCCAACC 
ACT AT GAC CACGT T C ACGT AT CAT TT AAC AAAT AAT AT AAAAAAGGAAGC 
TATTTGGCTTCTTTTTTATATGCCTTGCATAGACtTTCAAGGTTCTTATA 
TAATTTTTATTA 

SEQ ID NO. 6905 

STRAIN 18RS21 

CT GAT T T GGTAAAGCAAGAC AAT 
AAATCATCATATACTGTGAAATATGGTGATACAcTAAGcGTTATTTCAGA
AG C AAT GT C AAT T GAT AT GAAT GT C T TAG C AAAAa T AAAT AAC AT T GC AG 
AT AT C AAT CT TAT T T AT C c T G AGAC AAC a C T Ga C AGT AACT T AC G AT C AG 
AAGAGT CAT AC TGCCaCTT C AAT GAAAAT AG AAAC AC C AGC Aa C AAAT G C 
T G CT GGT C Aa AC Aa C AG CT AC T GT GG AT T T G AAAAC C AAT C Aa GT T T CT G 
T T G C AGAC C AAAAAGT T T C T C T C AAT AC AAT T T C GG AAG GT AT GAC AC C A 
GAAGCAGCAACAACGATTGTTTCGCCAATGAAGACaTATTCTTcTGCGCC 
AG CT T T G AAa T C AAAAG AAGT AT T AG C ACAAG AG C AAGC T GT T AGT C AAG 
CAGCAGCTAATGAACAGGTATCACCAGCTCCTGTGAAGTCGATTACTTCA 
G AAGT T C C AG C AG C T AAAG AG G AAGT T AAAC C AACT C AG ACGT C AGT C AG 
TCAGTCAACAACAGTATCACCAGCTTCTGTTGCCGCTGAAACACCAGCTC 
CAGTAGCTAAAGTAGCACCGGTAAGAACTGTAGCAGCCCCTAGAGTGGCA 
AGT GT T AAAG T AGT C ACT C C T AAAGT AG AAACT G G T G CAT C AC C AGAGC A 
T GT AT C AG C T C C AG C AGT T C C T GT GAC T ACGAC T T C AC C AG CT AC AG AC A 
GTAAGTTACAAGCGACTGAAGTTAAGAGCGTTCCGGTAGCACAAAAAGCT 
C C AAC AG C AAC AC CG GT AG C AC AAC C AG C T T C AAC AAC AAAT G C AG TAG C 
T GC ACAT CCT GAAAAT GCAGGGCTCC AAC CT CAT GTTGCAGCT TAT AAAG 
AAAAAGT AG C G T C AAC T T AT GGAGT T AAT GAAT T C AG T AC AT AC CGT GCG 
GG AGAT C C AGGT GAT CAT GGT AAAG GT T TAG C AG T T GAC T T TAT T GT AGG 
TACTAATCAAGCACTTGGTAATAAAGTTGCACAGTACTcTACACAAAATA 
TGGCAGCAAATAACATTTCATATGTTATCTGGCAACAAAAGTTTTACTCA 
AAT AC AAAC AGT AT T TAT G GAC C T G CT AAT AC T T GG AAT G C AAT G C CAGA 
TCGTGGTGGCGTTACTGCCAACCACTATGACCACGTTCACGTATCATTTA 
ACAAATAATATAAAAAAGGAAGCTATTTGGCTTCTTTTTTATATGCCTTG 
AATAGACTTTCAAGGTTCTTATATAATTTTTATTA 

SEQ ID NO. 6906 

STRAIN COH1 
CTGATTT 

GGT AAAG C AAG AC AAT AAAT CAT CAT AT ACT GT GAAAT AT GGT GAT AC AC 
T AAGC GT TAT T T C AG AAGC AAT GT C AAT T GAT AT GAAT GT C T T AG C AAAA 
AT T AAT AAC AT T G C AG AT AT C AAT C T TAT T T AT CCT GAG AC AAC ACT GAC 
AGT AAC T T ACG AT C AG AAG AGT CAT ACT G C C ACT T C AAT GAAAAT AG AAA 
C AC C AG C AAC AAAT GCTGCTGGT C AAAC AAC AG c T AC T GT C GAT T T G AAA 
AC C AAT C AAGT TTTTGTTG C AG AC C AAAAAGT T T C T C T C AAT AC AAT T T C 
G G AAG GT AT GAC AC C AG a a G C AG C AAC AAC GAT TGTTTCGC C AAT G AAG A 
CaTATTCTTCTGCGCCAGCTTTGAAATCAAAAGAAGTATTAGCACAAGAG 
CAAGCTGTTAGTCAAGTAGCAGCTAATGAACAGGTATCACCAGCTCCTGT 
G AAGT CG AT TACT T C AG AAGT T C C AGC AGC T AAAG AG G AAGT T AAAC C AA 
CTCAGACGTCAGTCAGTCAGTTAACAACAGTATCACCAGCTTCTGTTGCC 
GCTGAAACACCAGCTCCAGTAGCTAAAGTAGCACCGGTAAGAACTGTAGC 
AGCCCCTAGAGTGGCAAGTGcTAAAGTAGTCACTCcTAAAGTAGAAACTG 
GTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTCCTGTGACTACGACT 
T C AC C AG CT AC AG AC AGT AAGT T AC AAG C GAC T GAAGT T AAGAG C GT T CC 
GGT AG C AC AAAAAGC T CC AAC AG C AAC AC C G G TAG C AC AAC C AG C T T C AA 
C AAC AAAT G C AGT AG C T G C AC AT CCT GAAAAT G C AG GG CT C C AAC CT CAT 
GT T G C AGC T TAT AAAG AAAAAGT AGC GT C AAC TT AT G G AGT T AAT GAAT T 
C AGT AC AT AC CG T G C GGG AG AT C C AGG T GAT CAT GGT AAAGGTT T AG C AG 
T T G ACT T TAT T G T AGGT AAAAAC C AAG C ACT T G GT AAT G AAG T T G C AC AG 
TaCTCTACACAAAATATGGCAGCAAATAACATTTCATATGTTATCTGGCA 
AC AAAAGT T T TAT T C AAAT AC AAAT AGT AT T TAT G GAC C TG C T AAT ACTT 
G GAAT G C AAT G C C AG AT CGTGGTGG C GT T AC T G C C AAC C AC TAT G ACC AC 
GT T C AC GT AT CAT TT AAC AAAT AAT AT AAAAAAG G AAG C T AT TTGGCTTC 
TTTTTTATATGCCTTGAATAGACTTTCAAGGTTCTTATATAATTTTTATT 
A 

SEQ ID NO. 6907 

STRAIN M732 

CTGATTTGGTAAAGCAAGACAATAAATCATCATATACTGTGAAATATGGT 
G AT AC AnT AAGCGTT ATT TCAGAAGCAATGTCAATT GAT AT G AAT GTCTT 
AG C AAAAAT T AAT AAC AT T GC AG AT AT C AAT C T TAT T T AT CCT G AG AC AA 
CACTGACAGTAACTTACGATCAGAAGAGTCAtACTGCCACTTCAATGAAA 
AT AG AAAC AC C AG C AAC AAAT G CT G C T GGT C AAAC AAC AG CT ACT GT c G A 
T T TGAAAACC AAT C AAGTTTTTGT T GC AGAC CAAAAAGT T T CT CT CAAT A 
CAATTTCGGAAGGTATGACACCAGAAGCAGCAACAACGATTGTTTCGCCA
AT GAAG AC AT AT TCTTCTGCGC C AG CT T T GAAAT C AAAAGAAGT AT T AG C 
ACAAGAGCAAGCTGT T AGT C AAGT AGCAGCT AAT GAACAGGT AT C ACC AG 
CT C C T GT G AAGT C GAT TAG T T C AG AAGT T C C AG C AG CT AAAG AGG AAGT T 
AAAC C AAC T C AG AC GT C AGT C AGT CAGT T AAC AAC AGT AT C AC C AG C T T C 
TGTTGCCGCT G AAAC AC C AG CT C C AGT AG C T AAAGT AG C AC CG G T AAG AA 
CTGTAGCAGCCCCTAGAGTGGCAAGTGCTAAAGTAGTCACTCCTAAAGTA 
GAAACTGGTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTCCTGTGAC 
T AC GAC T T C AC C AG C T AC AG AC AGT AAGT TAG AAG C GAC T GAAG T T AAG A 
G CGT T C C G GT AG C AC AAAAAG CT C C AAC AG C Aa C AC C G GT AG C AC AAC C A 
GC T T C AAC AAC AAAT G C AGT AGC T G C AC AT C CT G AAAAT G C AG GG CT C C A 
AC C T C AT GT T GC AG C T TAT AAAG AAAAAG TAG CGT C AAC T TAT GG AGT T A 
AT GAAT T CAGT AC AT AC C G T G C GG G AG AT C C AG GT G AT CAT GGT AAAG GT 
TTAGCAGTTGACTTTAttgtaggtaaaaaccAAGCACTTGGTAATGAAGT 
T G C AC AGT ACT c T AC AC AAAAT AT GG C AG C AAAT AAC AT T T C AT AT GT T A 
TCTGGCAACAAAAGTTTTATTCAAATACAAATAGTATTTATGGACCTGCT 
AATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCCAACCACTA 
TGACCACGTTCACGTATCATTTAACAAATAATATAAAAAAGGAAGCTATT 
TGGCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGGTTCTTATATAAT 
TTTTATTA 

SEQ ID NO. 6908 

STRAIN M781 

CT G AT T T GG T AAAG C AAG AC AAT AAAT CAT CAT AT ACT GT GAAAT AT GGT 
GAT AC ACT AAG CGT TAT T T C AG AAG C AAT GT C AAT T GAT AT GAAT GT C T T 
AGC AAAAAT T AAT AAC AT T G C AGAT AT C AAT C T TAT T TAT C CT G AG AC AA 
C ACT GAC AGT AAC T T AC GAT C AG AAG AGT CAT AC T GC C AC T T C AAT G AAA 
ATAGAAACACCAGCAACAAATGCTGCTGGTCAAACAACAGCTACTGTCGA 
TTTGAAAACCAATCAAGTTTTTGTTGCAGACCAAAAAGTTTCTCTCAATA 
CAATTTCGGAAGGTATGACACCAGAAGCAGCAACAACGATTGTTTCGCCA 
AT GAAG AC AT AT TCTTCTGCGC C AG C T T T GAAAT C AAAAGAAGT AT T AGC 
AC AAG AG C AAG C T GT T AGT C AAG TAG C AG CT AAT G AAC AG G TAT C AC C AG 
CT C CT GT G AAGT C G AT T AC T T C AG AAGT T C C AG C AG CT AAAGAGG AAGT T 
AAAC C AACT C AGACG T CAGT CAGT C AG T T AAC AAC AGT AT C AC C AG C T T C 
TGTTGCCGCT G AAAC AC C AGC T C CAGT AG C T AAAG TAG C AC C G GT AAG AA 
CT GT AGC AG C C C CT AGAG T G G C AAGT G C T AAAGT AGT C AC T C CT AAAG T A 
GAAACTGGTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTCCTGTGAC 
T AC G ACT T C AC C AG CT AC AGAC AGT a a GT T AC AAG C GAC T G AAGT T AAG A 
GC GT T C CGGT AG C AC AAAAAG C T C C AAC AGC AAC AC C GGT AGC AC AAC C A 
G CT T C AAC AAC AAAT G CAGT AG CT GC AC AT C C T G AAAAT G C AGG GCT CCA 
AC CT CAT GT T G C AG C T T AT AAAG AAAAAGT AG CGT C AACT TAT GG AG T T A 
AT GAAT T C AG T AC AT AC CG T G C G G G AGAT C C AG GT G AT CAT GGT AAAG GT 
T TAG C AGT T G ACT T TAT T GT AGGT AAAAAC C AAG C ACT T GGT AAT G AAGT 
T G C AC AGT AC T C T AC AC AAAAT AT G G C AG C AAAT AAC AT T T CAT AT GT T A 
TCTGGCAACAAAAGTTTTATTCAAATACAAATAGTATTTATGGACCTGCT 
AAT AC T T G GAAT G C AAT G C C AG AT CGTGGTGGCGT T AC T G C C AACC AC T A 
T GAC C AC G T T C AC GT AT CAT T T AAC AAAT AAT AT AAAAAAGG AAGC T a T T 
TGGCTTCTTTTT T AT AT GC C T T G AAT AgAC T T T C AAGGT T C T TAT AT AAT 
TTTTATTA 

SEQ ID NO. 6909 

STRAIN CJB110 

CTGATTTGGT AAAGCAAGAC AAT AAAT CAT CAT AT ACT GTG AAA 
TAT GGT GAT AC ACT AAG CGT TAT T T C AG AAGC AAT G T C AAT T GAT AT G AA 
T G T C T TAG C AAAAAT T AAT AAC AT T G C AG AT AT C AAT C T TAT T TAT C CT G 
AG AC AAC AC T GAC AGT AAC T T AC GAT C AG AAG AGT CAT AC T G C C AC T T C A 
AT G AAAAT AG AAAC AC C AG C AAC AAAT G CT GCT GGT C AAAC ACC AG CT AC 
TGTGGATTTGAAAACCAATCAAGTTTcTGTTGCAGACCAAAAAGTTTCTC 
T C AAT AC AAT T T C G G AAG GT AT GAC AC C AG AAG C AG C AAC AAC GAT T GT T 
T CGCCAAT G AAG AC AT ATT CTTCTGCGCCAGCTTTG AAAT C AAAAGAAGT 
AT TAG C AC AAG AG C AAG C T GT T AGT C AAG C AG C AG CT AAT G AAC AG GT AT 
C AAC AG C T C C T G T G AAGT C G AT T AC T T C AG AAG T T C C AG C AG C T AAAG AG 
G AAGT T AAAC C AAC T C AG AC GT CAGT CAGT C AG T C AAC AAC AGT AT C AC C 
AgCTTCTGTTGCCGCTG AAAC AC CAGCTCC AGT AGCT AAAGT AGC ACCGG 
T AAg AACT GT AG C AG C C C CT AG AG T GG C AAGT GT T AAAG TAG T C AC T C C T 
AAAGTAGAAACTGGTGCATCACCAGAGCATGTATCAGCTCCAGCAGTTCC 
TGTGACT ACGACTT C AACAGc T ACAGACAGT a AGTT a C AAGCGACT GAAG 
TTAAGAGCGTTCCGGTAGCACAAAAAGCTCCAACAGCAACACCGGTAGCA 
CAAC C AGCTT CAACAACAAATGCAGTAGCTGCACATCCTGAAAAT GCAGG 
GCTCCAACCTCATGTTGCAGCTTATaAAGAAAAAGTAGCGTCAACTTATG 
GAGT T AAT GAAT T C AGT AC AT a C CGT G C AG GT GAT C C AgGT GAT CAT GGT 
AAAGGT T TAG C AGT c G AC T T T AT T GT Ag GT AAAAAC C AAG C AC T T G GT AA 
T G AAGT T G C AC AGT ACT C T AC AC AAAAT AT GG C AG C AAAT AAC AT T T CAT 
AT GT T AT CTGGCAACAAAAGT TT TACT CAAAT ACAAAT AGTAT TTAT GGA 
CCTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCCAA 
C CAT TAT G AC CAT G T T C AC GT AT CAT T T AAC AAAT AAT AT AAAAAAG GAA 
GCTATTTGGCTTCTTTTTTATATGCCTTGAATAGACtTTCAAGGTTCTTA 
TATAATTTTTATTA 

SEQ ID NO. 6910 

STRAIN 1169NT 
CTGATTTG 

GT AAAGC AAG AC AAT AAAT CAT C AT AT ACT GT GAAAT ATGGT GAT AC ACT 
AAG CGT TAT T T C AG AAG C AAT G T C AAT T GAT AT GAAT G T CT T AG C AAAAA 
T T AAT AAC AT T G C AGAT AT C AAT CT TAT T TAT C c T GAG AC AAC AC T G AC A 
GT AACTTACGAT CAgAAGAGT CAT ACTGCCACT T CAAT GAAAATAGAAAC 
AC CAGC AAC AAAT GCTGCTGGT C AAAC AAC AG C TACT GT GG AT T T G AAAA 
CCAATCAAGTTTCTGTTGCAGACCAAAAAGTTTCTCTCAATACAATTTCG 
G AAGGT AT G AC AC C AG AAG C Ag CAAC AAC GAT T GT T T C G C CAAT G AAGAC 
ATATTCTTCTGCGCCAGCTTTgAAATCAAAAGAAGTATTAGCACAAGAGC 
AAG C T G T T AG T C AAG CAGC AG C T AAT G AAC AGGT AT C AC C AG CT C CT G T G 
AAG T CG AT T AC T T C Ag AAGT T C C Ag C AG C T AAAG AG G AAGT TAG AC C Aa C 
TcAGACGTCAGTCAGTCAGTCAACAACAGTATCACCAgCTTCTGTTGCCG 
CTGAAACACCAGCTCCAGTAGCTAAAGTAGCACCGGTAAGAACTGTAGCA 
G C C C C AGC C C C T AGAGT G G C AAGT G C T AAAGT AGT C ACT C C T AAAGT AGA 
AAc T GGT GC ATC ACC AG AGC AT GT ACC AGCT C CAGC AGT TcC TGTGACT A 
c G AC T T CAAC AG C T AC a G AC Aa T a AG T T AC AAG C G AC T G AAGT TAAg AGC 
GtTCCGGTgGCACAAAAAGCTCCAACAGCAACACCGGTaGCACAACCAGC 
TT c AACAAC AAAT GCAGTAGc T GCAC AT C CT GAAAAT GCAGG ACT CCAAC 
CTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAACTTATGGAGTTAAT 
GAAT T C AGT AC AT aC C GT G C G GG AGAT CC AGG T GAT CAT G GT AAAGGT T T 
AG C AGT T G AC T T T AT T GT a g GT AAAAAC C AAG C AC T T GGT AAT G AAGT T G 
C AC AGT AC T C T AC AC AAAAT AT GG C AG CAAAT AAC AT T T CAT AT GT TAT C 
T G G C AAC AAAAGT T T T ACT CAAAT AC AAAT AGT AT T T AT GG AC CT G C T AA 
TACTTGGAATGCAATGCCAGATCGTGGTGGCGTTACTGCCAACCACTATG 
AC C AC GT T C AC GT AT CAT T T AAC AAAT AAT AT AAAAAAG GAAG C T AT T T G 
GCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGGtTCTTATATAATTT 
TTATTA 

SEQ ID NO. 6911 

STRAIN JM9130013 

c tgat t t g g t aaagc aag ac aat aaat cat cat at act 
g t gaaat at ggt gat ac ac taag c gt t at t t c ag aag caat gt caat t g a 
tat gaat gt c t tag c aaaaat aaat aac at t g c ag at at caat ct t at t t 
at c c t gag ac aac ac t g ac agt aac t t ac gat c ag aag ag t cat ac t g cc 
acttcaatgaaaatagaaacaccagcaacaaatgctgctggtcaaacaac 
agctactgtggatttgaaaaccaatcaagtttctgttgcagaccaaaaag 
tttctctcaatacaatttcggaaggtatgacaccagaagcagcaacaacg 
attgtttcgccaatgaagacatattcttctgcgccagctttgaaatcaaa 
agaagtattagcacaagagcaagctgttagtcaagcagcagctaatgaac 
agg tat c ac c ag c t c c t gt gaag t c gat t act t c ag aag t t c c ag c ag ct 
aaag ag gaag t t aaac c aac t c ag ac g t c ag t c ag t c ag t caac aac ag t 
at c acc ag cttctgttgccgctgaaac acc agct cc agt agct aaagt ag 
caccggtaagaactgtagcagcccctagagtggcaagtgttaaagtagtc 
act c c t aaagt ag aaact ggt g c at c ac c ag ag cat gt at c ag c t c c ag c 
agt t c c t g t g ac t ac g ac t t c ac c ag c t ac ag a c agt aag t t ac aag c g a 
c t gaag t taag ag c g t t c c ggt ag c ac aaaaag c t c caac ag caac ac c g 
gt ag c a caac c ag ct t caac aac aaat gc agt ag ct g c ac at c ct g aaaa 
TGCAGGGCTCCAACCTCATGTTGCAGCTTATAAAGAAAAAGTAGCGTCAA 
CTTATGGAGTTAATGAATTCAGTACATACCGTGCGGGAGATCCAgGTGAT 
C AT GGT AAAG GT T T AGC AGT T G AC T T T AT T GT AGGT ACT AAT C AAG C AC T 
T GGT AAT AAAGT T G C AC AGT ACT C T AC AC AAAAT AT G G C AGC AAAT AAC A 
T T T C AT AT GT TAT CT GG C AAC AAAAGT T T T AC T C AAAT AC AAAC AGT AT T 
TATGGACCTGCTAATACTTGGAATGCAATGCCAGATCGTGGTGGCGTTAC 
TGCCAACCACTATGACCACGTTCACGTATCATTTAACAAATAATATAAAA 
AAGGAAGCTATTTGGCTTCTTTTTTATATGCCTTGAATAGACTTTCAAGG 
T T C T TAT AT AAT T T T TAT T A 

SEQ ID NO. 6912 
STRAIN 2603 frame: 1 

MNKKVLLTSTMAASLLSVASVQAQETDTTWTARTVSEVKADLVKQDNKSSYTVKYGDTLS 
VISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSHTATSMKIETPATNAAGQTTA 
TVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTYSSAPALKSKEVLAQEQAVSQ
AAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVSPASVAAETPAPVAKVAPVRT
VAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSPATDSKLQATEVKSVPVAQKAPTA
TPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVNEFSTYRAGDPGDHGKGLAVD
FIVGTNQALGNKVAQYSTQNMAANNISYVIWQQKFYSNTNSIYGPANTWNAMPDRGGVTA
NHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6913 

STRAIN 090 frame: 2 

ETTLTVTYDQKSHTATSMKIETPATNAAGQTPATVDLKTNQVSVADQKVSLNTISEGMTP
EAATTIVSPMKTYSSAPALKSKEVLAQEQAVSQAAANEQVSTAPVKSITSEVPAAKEEVK
PTQTSVSQSTTVSPASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSA
PAVPVTTTSTATDSKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVA
AYKEKVASTYGVNEFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNIS
YVIWQQKFYSNTNSIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYAL
NRLSRFLYNFY 

SEQ ID NO. 6914 

STRAIN A909 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQGQAVSQAAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSTATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENARLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALHRLSRFLYNFY

SEQ ID NO. 6915 

STRAIN H36B frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQGQAVSQAAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSTATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENARLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALHRLSRFLYNFY

SEQ ID NO. 6916 

STRAIN 18RS21 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQEQAVSQAAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGTNQALGNKVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6917 

STRAIN M732 frame: 3 

DLVKQDNKSSYTVKYGDTXSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVFVADQKVSLNTISEGMTPEAATTIVSPMKTY 
SSAPALKSKEVLAQEQAVSQVAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQLTTVS 
PASVAAETPAPVAKVAPVRTVAAPRVASAKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTASPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN 
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6918 

STRAIN COH1 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVFVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQEQAVSQVAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQLTTVS
PASVAAETPAPVAKVAPVRTVAAPRVASAKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6919 

STRAIN M781 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVFVADQKVSLNTISEGMTPEAATTIVSPMKTY 
SSAPALKSKEVLAQEQAVSQVAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQLTTVS 
PASVAAETPAPVAKVAPVRTVAAPRVASAKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN 
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN 
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6920 

STRAIN CJB110 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTPATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY 
SSAPALKSKEVLAQEQAVSQAAANEQVSTAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS 
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSTATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN 
EFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSNTN 
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6921 

STRAIN 1169NT frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY 
SSAPALKSKEVLAQEQAVSQAAANEQVSPAPVKSITSEVPAAKEEVRPTQTSVSQSTTVS 
PASVAAETPAPVAKVAPVRTVAAPAPRVASAKVVTPKVETGASPEHVPAPAVPVTTTSTA
TDNKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYG
VNEFSTYRAGDPGDHGKGLAVDFIVGKNQALGNEVAQYSTQNMAANNISYVIWQQKFYSN 
TNSIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID NO. 6922 

STRAIN JM9130013 frame: 3 

DLVKQDNKSSYTVKYGDTLSVISEAMSIDMNVLAKINNIADINLIYPETTLTVTYDQKSH 
TATSMKIETPATNAAGQTTATVDLKTNQVSVADQKVSLNTISEGMTPEAATTIVSPMKTY
SSAPALKSKEVLAQEQAVSQAAANEQVSPAPVKSITSEVPAAKEEVKPTQTSVSQSTTVS 
PASVAAETPAPVAKVAPVRTVAAPRVASVKVVTPKVETGASPEHVSAPAVPVTTTSPATD
SKLQATEVKSVPVAQKAPTATPVAQPASTTNAVAAHPENAGLQPHVAAYKEKVASTYGVN
EFSTYRAGDPGDHGKGLAVDFIVGTNQALGNKVAQYSTQNMAANNISYVIWQQKFYSNTN 
SIYGPANTWNAMPDRGGVTANHYDHVHVSFNK.YKKGSYLASFLYALNRLSRFLYNFY

SEQ ID. NO. 7001 
STRAIN 2603 

AT G GG AG GGAAAAT G AAT C AAGAAGT C T T AC T AC AAAT GAT GAG AG C C ACT AT T CC T C 
G T GAT AG AG CCTTGCTT GAG G CAT T T T T AT AT T AC C AAG C AG AG CAT T T T GAT GAG G AGT 
G G GAT AG T C T TAT T CAT C AG T T T AT G AC C AAT AG G CAAG AAAT AAAT AAGT C T GT T C AAG 
TACTTCACTTTGAGACAGATGTTTCAGCTTTTGTCCAGGCTAGTCCTTATGATACTGCTC 
AT GAT CT AT T G AC C TAT AC AC AAGT TT T C G GC C AAAG T G GT C T T C AAAAAC T AG AT AAAC 
TATCGCCGTCTGAAAAAAACTTGGTGATAGAAGTGGCCTTGTTCAATCTGGCCACTCGTT 
TTCAATTATTGGATTCCAATGGACACTACCAAACCATATCGCCGGATTCACTCTTACAAA 
AGAGTAGGGGAGCTAATTTGGTCAATGTGTATCGTGTGGCTAATAATTTAGCGGATCGTA 
TTAGTCGAGATATTGAACAGTTTCTCTTAACTTACGAGCCTGAGCTTGAAACTAGAGCTG 
AT G AAACT GT T C TAG AAAAT GAAG AAACT G T T GAT GAG C AC AAAAC AAGT GT T CAT C AAG 
C AAT AT CT T T T C GAG AAG AG GGCTCTCTGGT TAT T G C T AGT T T GG AT GT AGAT TT GT CT C 
AACT AG AT GT T C AAAT AG G AAAAAC C AGT CAT C T G C C AG CT T AT G AAG AGT TAT C CT T AC 
G AC GT AAAT T T GAG AT T C T AAC AT AT T T T G AC C AAAT T CG AAAT G AACGT T C CAAAGT C C 
C AAGT T T T AGAC G AGGT GAT T T T G AC AC AG AG AT G GAAAT G AC AC C AGT C T T T GAT G GC G 
AGGAATTACTTACTTATCTCGAAGCTGATGGCAGTCCCTATGAGCTGAAACGAACGCTGA 
CT ACAGTCGAAGAAAAGGAATT AGAAAAAAT T GGAC AAGC CATT AGGAT AGAAAAT C AAG 
AAAAATTGACTCAGCTAGGGATTGATTTATCTCAGTTTGACCCAGACCGAGTCGGTATTT 
TATTGGATGCAGCAGGTCGTTTTCGTTTAAAAAATGCAGACCTTGCTTTACTAGGTGGTT 
AT C C C AAAG C CT C GGT AAC T C AAC T AGC C CT T G C G AC AG AACT AC T C C AAAT GGG ACT AA 
GTCATGAAAAGGTTGAATTTTTCTTTGGTAGCCAGCTTTCCATTGAAGAGCTGCGACAAG 
T T G C C T AC G C CT T T T TAT AC C AAG AACT C AG C AG AGAAG AT G CGG AGC AAT T T G AAAAAG 
AT AAAGGT AAT CAGC CAGAT TT AACT CT CAGAGATTGGAAAAGCAAGCTAGAGAAAGCT G 
AG G G AAAAGAAGT AGT T GAT GAAG AAT T C GCGG AAAAT C C AC T G GT T C AG AG AGT AT T G G 
AC AC T TAT CCTCTGGGGT CAT T GG T T T C C TAT AAG G G AC AG G AC T T T G AGGT CAT GT C GG 
TCAGCGATGCTCGATTGAACGGTTTGATTCGGATTGAGTTAGTCAATGACTTTTCGGATA 
T C AT T G AAC AAAAT C C AGT T C T T T AT GT G AGG AC CT GGG AAG AAG T C AGT C AGG C ACT T C 
AT C AG C C AAAG G C AG AAC C AC AAAC AG AG T TAG AAG AAG C GG AC C AAG AAT T AAAC CT AT 
T CT CATTT CT GG AAG AG G AGC C AGT T C AG AGT AT T GG ACT AT T GGAAC CAGAT GAT T CAG 
AAAAT GGT CAT AACGAT ACT GAT CTT GAAG AAAC AGAT AAT CAAATT C CTGAAGAGGAAG 
TCGTCGAAACAATTCCAGAGATTCCAGTAACGGACTTTTATTTTCCAGAAGATTTGACGG 
ACTTTTATCCTAAGACTGCTAGAGATAAGGTTGAGACAAACATTGTGGCCATTCGTTTGG 
T AAAAAAT C TAG AAG TAG AGC AC CG C AAT GC T T C AC C AAGT G AAC AAG AACT CCTTGCCA 
AGTATGTAGGCTGGGGTGGACTAGCCAATGAATTTTTTGATGACTATAATCCAAAATTTT 
CTAAGGAACGAGAAGAACTGAAGAGCCTAGTCACAGATAAAGAGTATTCGGATATGAAAC 
AGTCCTCCCTGACAGCCTATTACACAGACCCATCCCTGATCCGTCAGATGTGGGATAAGT 
T G G AAAGAG AT GG CT T T AC AG GT GG C AAAAT C CT AG AT C CT T C CAT GGG AAC AG GG AAT T 
TCTTTGCGGC TAT G C C AAAAC AC T T AAG AG AAAAG AGT G AGT T GT AT G G C G TAG AG T T AG 
AT ACT AT T AC AG GAG CT AT T GCC AAAC AC CT T CAT C C C AAT AGT CAT AT T GAAAT T AAG G 
GATTTGAGACGGTGGCTTTTAACGACAATAGTTTTGATTTGGTGATTTCAAATGTGCCCT 
TTGCCAATATACGAATTGCGGATAATAGGTACGATAGGCCTTACATGATTCATGACTACT 
TTGTCAAAAAGTCACTTGATTTGCTTCATGATGGTGGACAAGTAGCGATTATCTCTTCCA 
CAG G AAC TAT GG AT AAG C G AAC AGAAAAC AT C T T AC AAG AT AT T CG T GAG AC AAC T G AAT 
TTCTTGGTGGGGTTCGACTGCCTGACTCTGCCTTTAAGGCCATTGCAGGAACGAGTGTCA 
C AAC G GAT AT GT T AT T C T T C CAG AAAC ACT TAG AC AAGG GAT AT G T GG CAG AC G ATT TAG 
CCTTTTCAGGTTCCATTCGCTATGACAAGGATAGTCGCATTTGGCTCAATCCTTATTTTG 
ATGGAGAATACAATAGCCAGGTGCTAGGAACCTACGAGGTCAGGAATTTTAACGGAGGAA 
CACTTTCTGTTAAGGGGACTAGTGATGACTTGATTGCAAGTGTTGAAACAGCTCTAAATC 
ACGTTAAGGCCCCAAGAGAGATTGATAGAAATGAGGTCATCATTAACCCAGATGTGTTGA 
C C AAACAAGT C AAT GAT AC CT CC ATT CCAGCT GAAAT GAGGG AAAAT CT AGGT C AGT AC A 
G T T T T G GT T AT C AGGGGT C T AC AGT T T AC TAT C GAG AT AAC AAAG G C AT T C G AGT C G G AA 
CCAAGACGGAAGAAATCAGTTACTATGTCGATGAAGAGGGCAACTTCAAAGCATGGGACA 
C C AAAC AT T C T C AAAAG CAGAT T GAT CG C T T T AAT GCC T TAG AAGT G AC T GAT AAC ACT G 
CTCTGGATGTCTATGTGACCGATGATGCAGCCAAACGTGGTCAGTTTAAGGGGTATTATA 
AAAAG AC AG T T T T CT AT G AAGC T C C AT T GT C T T AT AAAGAAGT G GC AC G T AT C AAAG G AA 
T G GT CG AT AT T C G C AAT GC C T AC C AAG AAGT TAT T G C CAT T C AAC G C TAT TAT GAC T AT G 
ATAAGGAGACCTTTAACCACTTGTTAGGCAAACTCAATCGTACCTATGATAGCTTTGTCA 
AAC AC TAT GG GT AT T T G AAT AGT G CT GT G AAC CGC AAT C T T T T T GAT AGT GAT GAT AAG T 
ATTCGCTTCTTGCTAGTTTGGAAGATGAAAGTCTGGATCCAAGTGGAAAGTCTGTTATCT 
ATACTAAATCCCTTGCCTTTGAGAAGGCTCTAQTGCGTCCTGAAAAAGAGGTTAAAAAGG 
TGCATACTGCCCTTGATGCCTTAAATTCGAGCTTGGCTGACGGACGAGGTGTTGATTTCG 
C T TAT AT G AT GT C TAT CT AT C AG GT T G AAT CG CAGAT G AC CT T GAT T GAG G AGT T AGG C G 
AC CT CAT TAT GCC T G AT C CT GAG AAGT AT T T G AAT GG AG AAT T G AC CT AT GTTTCTCGCC 
AAG AC T T T CT T T C AGG GG AT G T C G T C ACT AAGT TAG AAGT GGT AG AT C TAT T CGT C AAAC 
AAGAC AAT CAGGACTT T AACT GGT CACAT TAT GCGGG ACT T CTAGAAGCT AT C AAAC CAG 
CCCGTATTACTTTGGCAGACATTGATTATCGAATCGGTTCACGCTGGATTCCTCTGGCTG 
TTTATGGAAAATTTGCCCAAGAAACCTTTATGGGGAAAGCCTATGAACTGTCAGACCAAG 
AAGT AG C GAC AGT C CT AG AAG T C AGT C C CAT T G ACGG G GT T AT C AC T T AC C AAT C T AAGT 
TTGCCTACACCTATTCCAACGCAACGGATAGGAGTTTAGGTGTCCCTGCTTCACGCTATG 
AT AG T G GT C G AAAAAT C T T T G AAAAT C T C CT G AAT T C C AAT C AAC C AAC CAT C AC AAAAC 
AAGT T GT C G AAGG GGAT AAG AAAAAGAAT GT GAC G GAT G TAG AG AAAAC AAC GGTCCTGC 
GTGCCAAGGAAACACACCTACAAGAACTCTTTCAAGGTTTTGTAGCAAAGTATCCAGAAG 
T C C AAC AAAT GAT T G AAG AC AC C T AT AAT AGGC T CT AC AAT C GT AC GGT AT C AAAG T C CT 
AT GAT GG T AGT CAT T T AAC CAT T GAT GGACT T G CT CAG AAT AT C T C C T T AC GT C C T C AC C 
AAAAGAATGCCATTCAACGAATTGTCGAGGAAAAACGTGCTCTACTAGCTCATGAAGTTG 
GTTCAGGTAAAACACTTACCATGCTTGGGGCAGGATTCAAACTGAAAGAACTCGGAATGG 
TACATAAACCACTTTATGTGGTGCCGTCTAGTCTGACTGCTCAGTTTGGTCAAGAAATCA 
T G AAAT T T T T C C C T AC C AAG AAAGT CT AT GT G ACT ACT AAG AAAG ACT T T G C C AAAGC C A 
AACGCAAGCAGTTTGTGTCCCGTATTATTACAGGGGACTATGATGCCATTGTCATTGGGG 
AT T C AC AAT T T GAG AAG AT AC C GAT G AGT CGT GAAAAAC AG GT C AC C T AT AT C AAT G AC A 
AAC T T GAG C AACT C C GAG AAAT C AAG C T AG G AAGT GAC AGT G AT T AC AC GG T G AAAG AAG 
CGGAACGTTCGATTAAGGGATTAGAACACCAGTTGGAAGAACTCCAAAAACTAGAGCGAG 
ATACCTTTATTGAGTTTGAAAACCTTGGAATTGATTTTCTTTTTGTGGATGAGGCTCATC 
AC T T C AAGAAT AT C C GT C C AAT C AC T GG AC T T G GG AAT GTAGC T GG AAT C AC C AAC AC AA 
CT T CT AAAAAG AAC GT G GAT AT GG AG AT GAAGGT GAG AC AAGT AC AG G CAG AG C AT GG AG 
ATAGAAATGTCGTTTTTGCGACAGGAACACCAGTTTCTAACTCTATTAGTGAACTTTTCA 
C CAT G AT GGAT T AC AT T C AAC C T G AT GT CT T G G AAC GAT AC C T GG TAT C AAAT T T T GAC T 
CCTGGGTTGGGGCTTTTGGGAATATCGAAAACTCCATGGAACTAGCCCCGACAGGAGATA 
AGTACCAACC CAAGAAACGGTT CAAGAAATTTGT CAACCTT CCTG AACT CAT GCGAATCT 
AC AAGG AAACT G C C GAT AT T CAG AC C T C AGAC AT G CT T GAT T T AC C AGT AC CGG AAGCT A 
AG AT TAT T G C GGT GG AAAGC G AGT T AAC GC AAG C T CAG AAAT AC TAT T T G G AAG AG CT GG 
TAAAGCGTTCAGACGCTATCAAGTCAGGTAGTGTTGATCCAAGTAGAGATAACATGCTTA 
AAAT CACAGGAGAAGCCAGAAAACTAGCTATTGATATGCGGTT GAT TGACCCTACTTACT 
CCTTATCGGATAATCAGAAAATCCTTCAAGTAGTCGATAATGTCGAGCGGATTTACCGTG 
ATGGAGCTGGAGACAAAGCCACTCAGATGATTTTCTCAGATATTGGAACCCCTAAAAGTA 
AG G AAG AAGGGT T T G AT GT C T AC AAT G AAC T T AAG GAC T TGTTTGTC GAT C GAG GGAT AC 
C AAAAG AAGAAAT TGCCTTTGTC CAT G AT GC C AAT AC T GAT G AG AAG AAAAAC T CT CT GT 
CACGCAAGGTCAATAGTGGAGAAGTACGGATTCTCATGGCTTCTACGGAAAAAGGGGGAA 
CAGGATTAAACGTCCAATCTCGCATGAAAGCTGTCCACTATTTAGACGTTCCCTGGAGGC 
C CT C AG AC AT T GT C C AGC G AAAT G GAC GAC T AAT T CGAC AAGGAAAC AT G C AC C AGG AGG 
T AGAT AT T TAT C AC TAT AT T AC T AAAGG GAGC T T T GAC AAT T AC C T CT GG C AG AC G C AGG 
AG AAT AAG C T AAAGT AT AT C AC C C AGAT AAT GAC CT C AAAAG AT C C T GT GAG AT CAG C T G 
AAG AC AT T GAT G AAC AAAC CAT GAC C G C C T CAG ACT T T AAGG CAT T G G C AAC T G GG AAC C 
CT TAT CT CAAACT CAAAATGGAGTTGGAAAAT GAACTG AC AGTTTTAGAGAAT C AAAAAC 
GAG C CT T T AAT C G C T C C AAAGAC GAG TAT C GC C AT AC CAT T T C C TAT AG C GAG AAG C AC C 
T C C C TAT TAT G GAAAAAC GG T T GAG T C AAT AT GAT AAAG AT AT T G C C C AAT C T T T G G C AA 
CCAAGTCGCAAGATTTTGTCATGCGATTTGACAATCAAGCAATGGATAATCGTGCTGAAG 
CTGGGGACTATCTGCGAAAACTCATTACCTATAACCGCTCAGAGACCAAGGAAGTCAGGA 
C AC T T G C CAG C T T TAG AGG AT T T GAT T T AAAAAT G ACT AC ACG AG G T G C T AGT GAGC C C T 
T AC CAG AAAC CAT T T C T T T AAT GAT T GT AGGT GAT AAC C AGT AT AC TGTCGCCCTT GAT T 
TGAAATCAGACGTGGGAACCATTCAACGGATTAGTAATGCCATTGACCATATTATAGATG 
AC C AAG AAAAGAC G C AAG AG C T GGT AAAGGAT T T AAAAG AT AAG C T AC G AGT AG C C AAAG 
TAGAAGTTGAT AAAGT CTTTCCAAAGGAAGAGGACTATCAGCTTGTAAAGGCT AAGT ATG 
AT GT T T TAG CTCCCTTGGTT GAAAAAGAAG CAG AGAT T G AAG AG AT AG AT G C AG CT T T GG 
C C AAGT T T AGT G AAG AT AC AAC AC C C C AAAAGAAG C AAC AAAT AG C ACT C GAG AT A 

SEQ ID. NO. 7002 

STRAIN H36B 

GG AGG GAAAAT G AAT C AAGAAGT CT TACT AC AAAT GAT 
GAGAGCCACTATTCCTCGTGATAGAGCCTTGCTTGAGGCATTTTTATATT 
AC C AAG CAG AG CAT T T T GAT GAG G AGT GG GAT AG T CT TAT T CAT C AGT T T 
AT GAC C AAT AGG C AAGAAAT AAAT AAG T C T GT T C AAGT AC T T C AC T T T G A 
GACAGATGTTTCAGCTTTTGTCCAGGCTAGTCCTTATGATACTGCTCATG 
ATCTATTGACCTATACACAAGTTTTCGGCCAAAGTGGTCTTCAAAAACTA 
GATAAACTATCGCCGTCTGAAAAAAACTTGGTGATAGAAGTGGCCTTGTT 
CAATCTGGCCACTCGTTTTCAATTATTGGATTCCAATGGACACTACCAAA 
CCATATCGCCGGATTCACTCTTACAAAAGAGTAGGGGAGCTAATTTGGTC 
AATGTGTATCGTGTGGCTAATAATTTAGCGGATCGTATTAGTCGAGATAT 
T G AAC AGT T T C T C T T AAC T TAG GAG C C T GAG CT T G AAAC TAG AG C T GAT G 
AAACT GT T C T AG AAAAT G AAGAAAC T GT T GAT GAGC AC AAAAC AAGT GT T 
CATCAAGCAATATCTTTTCGAGAAGAGGGCTCTCTGGTTATTGCTAGTTT 
GGAT GT AG AT T T GT C T C AAC T AG AT GT T C AAAT AGG AAAAAC C AGT CAT C 
T G C CAG C T T AT G AAG AGT TAT C C T T AC G AC GT AAAT T T GAG AT T C T AAC A 
TATTTTGACCAAATTCGAAATGAACGTTCCAAAGTCCCAAGTTTTAGACG 
AGGT GAT T T T G AC AC AG AG AT G G AAAT G AC AC C AG T C T T T GAT G GC G AG G 
AAT T AC T TACT TAT CT C GAAG C T GAT G G C AGT C C C TAT GAG C T G AAAC GA 
ACGCTGACTACAGT CGAAGAAAAGGAAT T AGAAAAAATTGGACAAGCCAT 
TAGGAT AGAAAAT CAAGAAAAAT T GACTCAGCTAs GkATTGr TTTAT CT C 
AGTTTGACCCAGACCGAGTCGGTATTTTATTGkATGCAGCAGGTCGTyyT 
CGTTTAwAwAATGCAGACCTTGCTTCACTAGGTGGTTATCCCAAAGCCTC 
GGTAACTCAACTAGCCCTTGCGACAGAACTACTCCAAATGGGACTAAGTC 
ATGAAAAGGTTGAATTTTTCTTTGGTAGCCAGCTTTCCATTGAAGAGCTG 
CGAC AAGT T G C CT ACG C C TT T TT AC AC C AAG AAC T C AG C AGAG AAG AT G C 
GGAG C AAT T T GAAAAAGAT AAAG G T AAT C AG C C AG AT T T AACT C T C AG AG 
AT T G GAAAAGC AAG CT AGAGAAAGC T GAG G GAAAAG AAGT AGT T GAT G AA 
G AAT T CGCG GAAAAT C C ACT G GT T C AGAGAGT AT T GG AC ACT T AT C C T CT 
GGGGTCATTGGTTTCCTATAAGGGACAGGACTTTGAGGTCATGTCGGTCA 
GCGATGCTCGAtTGAACGGTTTGATTCGGATTGAGTTAGTCAATGACTTT 
T C G GAT AT CAT T G AAC AAAAT C C AGT T CT T TAT GT GAG G AC C T GGG AAG A 
AGTCAGTCAGGCACTTCATCAGCCAAAGGCAGAACCACAAACAGAGTTAG 
AAG AAGC GG AC C AAGAAT T AAAC CT AT T CT C AT T T C T G GAAG AG G AGC T A 
GTTCAGAGTATTGGACTATTGGAACCAGATGATTCAGAAAATGGTCATAA 
CG AT AC T GAT C T T GAAG AAAC AG AT AAT C AAATT C C T G AAG AGGAAG T CG 
T C GAAAC AAT T C C AGAG AT T C C AGT AACGG AC TT T T AT T T T C C AGAAG AT 
TTGACGGACTTTTATCCTAAGACTGCTAGAGATAAGGTTGAGACAAACAT 
TGTGGCCATTCGTTTGGTAAAAAATCTAGAAGTAGAGCACCGCAATGCTT 
CACCAAGTGAACAAGAACTCCTTGCCAAGTATGTAGGCTGGGGTGGACTA 
GC C AAT G AAT T T T T T GAT G AC TAT AAT C C AAAAT T T T CT AAGG AAC GAGA 
AGAAC T GAAGAG C CT AGT C AC AG AT AAAG AGT AT T C G GAT AT G AAACAGT 
CCTCCCTGACAGCCTATTACACAGACCCATCCCTGATCCGTCAGATGTGG 
GAT AAGT T GGAAAG AG AT G G C T T T AC AGGT GGC AAAAT C C TAG AT C CT T C 
CATGGGAACAGGGAATTTCTTTGCGGCTATGCCAAAACACTTAAGAGAAA 
AG AG T G AGT T G TAT G G CGT AG AG T TAG AT AC TAT T AC AG GAG C TAT T G CC 
AAAC AC C T T CAT C C C AAT AGT CAT AT T G AAAT T AAG G GAT T T G AG AC GGT 
GGCTTTTAACGACAATAGTTTTGATTTGGTGATTTCAAATGTGCCCTTTG 
C C AAT AT ACGAAT T G C GGAT AAT AG GT ACG AT AGG C C T T AC AT GAT T C AT 
GACTACTTTGTCAAAAAGTCACTTGATTTGCTTCATGATGGTGGACAAGT 
AGCG AT T AT C T C T T C C AC AGG AAC TAT G GAT AAG C G AAC AG AAAAC AT CT 
TACAAGATATTCGTGAGACAACTGAATTTCTTGGTGGGGTTCGACTGCCT 
GAC TCTGCCTT T AAG G CC AT T G C AG G AACGAGT GT C AC AAC GGAT AT GT T 
AT T CT T C C AG AAAC AC T T AG AC AAGGG AT AT GT GG C AGAC GAT T T AGC CT 
TTTCAGGTTCCATTCGCTATGACAAGGATAGTCGCATTTGGCTCAATCCT 
TATTTTGATGGAGAATACAATAGCCAGGTGCTAGGAACCTACGAGGTCAG 
G AAT T T T AAC GGAG GAAC AC T T T C T GT T AAGGGG AC T AGT GAT GAC T T G A 
T T G C AAGT GT T GAAAC AG C T C T AAAT C ACGT T AAGG C C C C AAGAG AG AT T 
GAT AG AAAT GAGGT CAT CAT T AAC CCAGATGTGTT GAC C AAAC AAGT C AA 
T GAT AC C T C CAT T C C AG C T G AAAT GAG G GAAAAT CT AGG T C AGT AC AG T T 
T T GGT TAT C AGGG GT C T AC AG T T T AC TAT C GAGAT AAC AAAG G CAT T C G A 
GT C G G AAC C AAG AC G G AAGAAAT C AGT T AC TAT GT C GAT GAAGAG 

SEQ ID. NO. 7003 

STRAIN 18RS21 

GnAGGGAAAATGAATCAAGAAGTCTTACTACAAATGATGAGA 
GCCACTATTCCTCGTGATAGAGCCTTGCTTGAGGCATTTTTATATTACCA 
AGC AG AG CAT T T T GAT GAGG AGT G GGAT AGT C T T AT T CAT C AG T T TAT G A 
CC AAT AGGC AAG AAAT AAAT AAGT CT GTTC AAGT ACTTC ACT TTGAG AC A 
GATGTTTCAGCTTTTGTCCAGGCTAGTCCTTATGATACTGCTCATGATCT 
ATTGACCTATACACAAGTTTTCGGCCAAAGTGGTCTTCAAAAACTAGATA 
AACTATCGCCGTCTGAAAAAAACTTGGTGATAGAAGTGGCCTTGTTCAAT 
CTGGCCACTCGTTTTCAATTATTGGATTCCAATGGACACTACCAAACCAT 
AT C GC CGG AT T C AC T C T T AC a AAAGAGT AG GGG AG C T AAT T T G G T C AAT G 
T GT AT CGTGTGGC T AAT AAT T T AG C G GAT C GT AT T AG T C GAG AT AT T G AA 
CAGTTTCTCTTAACTTACGAGCCTGAGCTTGAAACTAGAGCTGATGAAAC 
TGTTCTAGAAAATGAAGAAACTGTTGATGAGCACAAAACAAGTGTTCATC 
AAGCAATATCTTTTCGAGAAGAGGGCTCTCTGGTTATTGCTAGTTTGGAT 
GT AG ATT T GT C T C AACT AG AT GT T C AAAT AGGAAAAAC C AGT CAT CT GC C 
AG CT TAT GAAG AGT TAT C CT T AC G AC GT AAAT T T GAGAT T C T AAC AT AT T 
T T G ACC AAAT T C G AAAT GAAC GT T C C AAAG T C C C AAG T T T TAG AC GAGGT 
GAT T T T GAC AC AG AG AT GG AAAT G AC AC C AGT CT T T GAT G GC G AG G AAT T 
ACT TAG T TAT C T C GAAG C T GAT G G C AGT C C CT AT GAG CT GAAACG AACGC 
T G ACT AC AG t c G AAG AAAAGG AAT T AG AAAAAAT T GG AC AAG C CAT T AGG 
AT AG AAAAT C AAGAAAAAT T GAC T C AG CT AGGG AT T GAT T T AT C T C AGT T 
T GAC C C AG AC C GAGT C G GT AT T T TAT T G GAT G C AG C AGGT CGTTTTCGTT 
TAAAAAATGCAGACCTTGCTTTACTAGGTGGTTATCCCAAAGCCTCGGTA 
ACT C AACT AG CC CT T G C G AC AGAACT AC T C C AAAT GGG ACT AAGT CAT GA 
AAAGGTTGAATTTTTCTTTGGTAGCCAGCTTTCCATTGAAGAGCTGCGAC 
AAG T T GCC T ACG C C T T T T T AC AC C AAG AACT C AG C AG AG AAG AT G C GG AG 
CAATTTGAAAAAGATAAAGGT AAT CAGCCAGAT TT AACT CT CAGAGAT T G 
G AAAAG C AAG C T AG AG AAAG CT G AGGGAAAAG AAGT AGT T GAT GAAG AAT 
TCGCGGAAAATCCACTGGTTCAGAGAGTATTGGACACTTATCCTCTGGGG 
T CAT T G GT T T C CT AT AAG GG AC AG GAC T T T G AGGT CAT GT C G GT C AG C G A 
TGCTCGATTGAACGGTTTGATTCGGATTGAGTTAGTCAATGACTTTTCGg 
AT AT CATTGAACAAAAT C CAGT TC t TT At GT GAGGACCT GGG AAGAAGTC 
AGT C AGG C AC T T CAT C AG C C AAAGG C AG AAC C AC AAAC AG AGT TAG AAG A 
AG CGG AC C AAGAAT T AAAC C TAT T CT CAT T T C T G GAAG AGG AG C C AGT T C 
AG AGT AT T G G ACT AT T GG AAC C AG a T GAT T C AGAAAAT G GT CAT AAC GAT 
ACT GAT CTT GAAGAAACAGAT AAT CAAATT CCTGAAGAGGAAGT CGT CGA 
AAC AAT T C CAGAGAT T C C AGT AACGG ACT TT T AT T T T C C AGAAGAT T T GA 
C GG ACT T T T AT C C T AAG ACT GC T AG AG AT AAG GT T GAGAC AAAC AT T GT G 
G C CAT T C GT T T GGT AAAAAAT C T AGAAGT AG AG C AC CG C AAT G C T T C AC C 
AAGTGAACAAGAACTCCTTGCCAAGTATGTAGGCTGGGGTGGACTAGCCA 
AT G AAT T T T T T GAT GAC TAT AAT C C AAAAT T T T C T AAGG a AC G AG AAGAA 
CT GAAG AG CC T AGT C AC AG AT AAAG AGT AT T C GGAT AT GAAAC AGT C C T C 
C C T GAC AGC C T AT T AC AC AG AC C CAT C C CT GAT C C GT C AGAT GT GGG AT A 
AGT TGG AAAG AG ATGGCTTT AC AGGTGGC AAAAT CCTAGATCCTTCCATG 
GGAACAGGGAATTT CTT TGCGG CT ATGC CAAAAC ACTT AAG AGAAAAGAG 
TGAGTTGTATGGCGTAGAGTTAGATACTATTACAGGAGCTATTGCCAAAC 
ACCTTCATCCCAATAGTCATATTGAAATTAAGGGATTTGAGACGGTGGCT 
TTTAACGACAATAGTTTTGATTTGGTGATTTCAAATGTGCCCTTTGCCAA 
TATACGAATTGCGGATAATAGGTACGATAGGCCTTACATGATTCATGACT 
ACTTTGTCAAAAAGTCACTTGATTTGCTTCATGATGGTGGACAAGTAGCG 
ATTATCTCTTCCACAGGAACTATGGATAAGCGAACAGAAAACATCTTACA 
AGATATTCGTGAGACAACTGAATTTCTTGGTGGGGTTCGACTGCCTGACT 
CTGCCTTTAAGGCCATTGCAGGAACGAGTGTCACAACGGATATGTTATTC 
T T C CAG AAAC AC T TAG AC AAGG GAT AT G T GG C AG ACGAT T T AG C C T T T T C 
AGGTTCCATTCGCTATGACAAGGATAGTCGCATTTGGCTCAATCCTTATT 
TTGATGGAGAATACAATAGCCAGGTGCTAGGAACCTACGAGGTCAGGAAT 
T T T AAC GGAG G AAC ACT T T CT GTT AAGGG G AC T AGT GAT G ACT T GAT T G C 
AAGTGTTGAAACAGCTCTAAATCACGTTAAGGCCCCAAGAGAGATTGATA 
GAAAT GAGGT CAT C ATT AACCCAGATGT GTT GAC C AAAC AAGT CAAT GAT 
ACCTCCATTCCAGCTGAAATGAGGGAAAATCTAGGTCAGTACAGTTTTGG 
TTATCAGGGGTCTACAGTTTACTATCGAGATAACAAAGGCATTCGAGTCG 
G AAC C AAG AC G GAAG AAAT CAGT T AC TAT GT C GAT GAAG AG 



SEQ ID. NO. 7004 
STRAIN H36B frame: 1 

GGKMNQEVLLQMMRATIPRDRALLEAFLYYQAEHFDEEWDSLIHQFMTNRQEINKSVQVL 
HFETDVSAFVQASPYDTAHDLLTYTQVFGQSGLQKLDKLSPSEKNLVIEVALFNLATRFQ 
LLDSNGHYQTISPDSLLQKSRGANLVNVYRVANNLADRISRDIEQFLLTYEPELETRADE 
TVLENEETVDEHKTSVHQAISFREEGSLVIASLDVDLSQLDVQIGKTSHLPAYEELSLRR 
KFEILTYFDQIRNERSKVPSFRRGDFDTEMEMTPVFDGEELLTYLEADGSPYELKRTLTT 
VEEKELEKIGQAIRIENQEKLTQLXIXLSQFDPDRVGILLXAAGRXRLXNADLASLGGYP 
KASVTQLALATELLQMGLSHEKVEFFFGSQLSIEELRQVAYAFLHQELSREDAEQFEKDK 
GNQPDLTLRDWKSKLEKAEGKEVVDEEFAENPLVQRVLDTYPLGSLVSYKGQDFEVMSVS 
DARLNGLIRIELVNDFSDIIEQNPVLYVRTWEEVSQALHQPKAEPQTELEEADQELNLFS 
FLEEELVQSIGLLEPDDSENGHNDTDLEETDNQIPEEEVVETIPEIPVTDFYFPEDLTDF 
YPKTARDKVETNIVAIRLVKNLEVEHRNASPSEQELLAKYVGWGGLANEFFDDYNPKFSK 
EREELKSLVTDKEYSDMKQSSLTAYYTDPSLIRQMWDKLERDGFTGGKILDPSMGTGNFF 
AAMPKHLREKSELYGVELDTITGAIAKHLHPNSHIEIKGFETVAFNDNSFDLVISNVPFA 
NIRIADNRYDRPYMIHDYFVKKSLDLLHDGGQVAIISSTGTMDKRTENILQDIRETTEFL 
GGVRLPDSAFKAIAGTSVTTDMLFFQKHLDKGYVADDLAFSGSIRYDKDSRIWLNPYFDG 
EYNSQVLGTYEVRNFNGGTLSVKGTSDDLIASVETALNHVKAPREIDRNEVIINPDVLTK 
QVNDTSIPAEMRENLGQYSFGYQGSTVYYRDNKGIRVGTKTEEISYYVDEE 

SEQ ID. NO. 7005 

STRAIN 18RS21 frame: 1 

XGKMNQEVLLQMMRATIPRDRALLEAFLYYQAEHFDEEWDSLIHQFMTNRQEINKSVQVL 
HFETDVSAFVQASPYDTAHDLLTYTQVFGQSGLQKLDKLSPSEKNLVIEVALFNLATRFQ 
LLDSNGHYQTISPDSLLQKSRGANLVNVYRVANNLADRISRDIEQFLLTYEPELETRADE 
TVLENEETVDEHKTSVHQAISFREEGSLVIASLDVDLSQLDVQIGKTSHLPAYEELSLRR 
KFEILTYFDQIRNERSKVPSFRRGDFDTEMEMTPVFDGEELLTYLEADGSPYELKRTLTT 
VEEKELEKIGQAIRIENQEKLTQLGIDLSQFDPDRVGILLDAAGRFRLKNADLALLGGYP 
KASVTQLALATELLQMGLSHEKVEFFFGSQLSIEELRQVAYAFLHQELSREDAEQFEKDK 
GNQPDLTLRDWKSKLEKAEGKEVVDEEFAENPLVQRVLDTYPLGSLVSYKGQDFEVMSVS 
DARLNGLIRIELVNDFSDIIEQNPVLYVRTWEEVSQALHQPKAEPQTELEEADQELNLFS 
FLEEEPVQSIGLLEPDDSENGHNDTDLEETDNQIPEEEVVETIPEIPVTDFYFPEDLTDF 
YPKTARDKVETNIVAIRLVKNLEVEHRNASPSEQELLAKYVGWGGLANEFFDDYNPKFSK 
EREELKSLVTDKEYSDMKQSSLTAYYTDPSLIRQMWDKLERDGFTGGKILDPSMGTGNFF 
AAMPKHLREKSELYGVELDTITGAIAKHLHPNSHIEIKGFETVAFNDNSFDLVISNVPFA 
NIRIADNRYDRPYMIHDYFVKKSLDLLHDGGQVAIISSTGTMDKRTENILQDIRETTEFL 
GGVRLPDSAFKAIAGTSVTTDMLFFQKHLDKGYVADDLAFSGSIRYDKDSRIWLNPYFDG 
EYNSQVLGTYEVRNFNGGTLSVKGTSDDLIASVETALNHVKAPREIDRNEVIINPDVLTK 
QVNDTSIPAEMRENLGQYSFGYQGSTVYYRDNKGIRVGTKTEEISYYVDEE 

SEQ ID. NO. 7006 

STRAIN 2603 frame: 1 

GGKMNQEVLLQMMRATIPRDRALLEAFLYYQAEHFDEEWDSLIHQFMTNRQEINKSVQVL 
HFETDVSAFVQASPYDTAHDLLTYTQVFGQSGLQKLDKLSPSEKNLVIEVALFNLATRFQ
LLDSNGHYQTISPDSLLQKSRGANLVNVYRVANNLADRISRDIEQFLLTYEPELETRADE
TVLENEETVDEHKTSVHQAISFREEGSLVIASLDVDLSQLDVQIGKTSHLPAYEELSLRR
KFEILTYFDQIRNERSKVPSFRRGDFDTEMEMTPVFDGEELLTYLEADGSPYELKRTLTT
VEEKELEKIGQAIRIENQEKLTQLGIDLSQFDPDRVGILLDAAGRFRLKNADLALLGGYP
KASVTQLALATELLQMGLSHEKVEFFFGSQLSIEELRQVAYAFLYQELSREDAEQFEKDK
GNQPDLTLRDWKSKLEKAEGKEVVDEEFAENPLVQRVLDTYPLGSLVSYKGQDFEVMSVS
DARLNGLIRIELVNDFSDIIEQNPVLYVRTWEEVSQALHQPKAEPQTELEEADQELNLFS
FLEEEPVQSIGLLEPDDSENGHNDTDLEETDNQIPEEEVVETIPEIPVTDFYFPEDLTDF
YPKTARDKVETNIVAIRLVKNLEVEHRNASPSEQELLAKYVGWGGLANEFFDDYNPKFSK
EREELKSLVTDKEYSDMKQSSLTAYYTDPSLIRQMWDKLERDGFTGGKILDPSMGTGNFF
AAMPKHLREKSELYGVELDTITGAIAKHLHPNSHIEIKGFETVAFNDNSFDLVISNVPFA
NIRIADNRYDRPYMIHDYFVKKSLDLLHDGGQVAIISSTGTMDKRTENILQDIRETTEFL
GGVRLPDSAFKAIAGTSVTTDMLFFQKHLDKGYVADDLAFSGSIRYDKDSRIWLNPYFDG
EYNSQVLGTYEVRNFNGGTLSVKGTSDDLIASVETALNHVKAPREIDRNEVIINPDVLTK
QVNDTSIPAEMRENLGQYSFGYQGSTVYYRDNKGIRVGTKTEEISYYVDEE

SEQ ID NO. 7101 
STRAIN 2603 

ATGAAAAAGAAAATTATTTTGAAAAGTAGTGTTCTTGGTTTAGTCGCTGGGACTTCTATT 
ATGTTCTCAAGCGTGTTCGCGGACCAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTT 
CATGGTGCACTTGACAATACTGGAACAGCAAATATGCCTGATGGAAAAGTTGCTAATGCT 
GGT ACT GCT GCT C AAT TAG AT G C T TAT AT G GAT G AC G C T C AAAAAG AT T T C AAAC AAAC T 
AAC C C T AAT G GT G AAAG CAT T AGG GT T C AAG C AGG C GAT AT G GT T G GAG C AAGT C C AG C C 
AACTCTGGGCTTCTTCAAGATGAACCAACTGTCAAAAATTTTAATGCAATGAATGTTGAG 
TAT G G C AC AT T GGGT AAC CAT G AAT T T GAT G AAGGGT T G G C AG AAT AT AAT C GT AT CGT T 
ACTGGTAAAGCCCCTGCTCCAGATTCTAATATTAATAATATTACGAAATCATACCCACAT 
GAAGCTGCAAAACAAGAAATTGTAGTGGCAAATGTTATTGATAAAGTTAACAAACAAATT 
CCTTACAATTGGAAGCCTTACGCTATTAAAAATATTCCTGTAAATAACAAAAGTGTGAAC 
GTTGGCTTTATCGGGATTGTCACCAAAGACATCCCAAACCTTGTCTTACGTAAAAATTAT 
G AAC AAT AT GAAT T T T TAG AT GAAG C T GAAAC AAT CGT T AAAT AC G C C AAAG AAT T AC AA 
GCTAAAAATGTCAAAGCTATTGTAGTTCTCGCACATGTACCTGCAACAAGTAAAAATGAT 
AT T G CT G AAGGT GAAG C AG C AG AAAT GAT G AAAAAAGT C AAT C AAC T CT T CC CT G AAAAT 
AGCGTAGATATTGTCTTTGCTGGACACAATCATCAATATACAAATGGTCTTGTTGGTAAA 
ACTCGTATTGTACAAGCGCTCTCTCAAGGAAAAGCCTATGCTGATGTACGTGGTGTCTTA 
GAT AC T GAT AC AC AAG AT T T CAT T GAG AC C C C T T C AG C T AAAGT AAT T G C AGT T G C T C CT 
G G T AAAAAAAC AGGT AGT G C C GAT AT T C AAG C CAT T G T T G AC C AAGC T AAT AC TAT CGT T 
AAAC AAGT AAC AG AAG CT AAAAT TGGTACTGCC G AGGT AAGT GT CAT GAT T ACG C GT T CT 
GTTGATCAAGATAATGTTAGTCCGGTAGGCAGCCTCATCACAGAGGCTCAACTAGCAATT 
GCTCGAAAAAGCTGGCCAGATATCGATTTTGCCATGACAAATAATGGTGGCATTCGTGCT
GACTTACTCATCAAACCAGATGGAACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCT 
TTT GGTAAT AT CTT ACAAGT CGTCGAAAT T ACT GGT AGAGATCTTT AT AAAGCACT CAAC 
GAACAATACGACCAAAAACAAAATTTCTTCCTTCAAATAGCTGGTCTGCGATACACTTAC 
ACAGATAATAAAGAGGGCGGGGAAGAAACACCATTTAAAGTTGTAAAAGCTTATAAATCA 
AAT GGT GAG GAAAT C AAT C C T GAT G C AAAAT AC AAAT T AGT TAT C AAT G ACT T T T T AT T C 
GGTGGTGGTGATGGCTTTGCAAGCTTCAGAAATGCCAAACTTCTAGGAGCCATTAACCCC 
GATACAGAGGTATTTATGGCCTATATCACTGATTTAGAAAAAGCTGGTAAAAAAGTGAGC 
GT T C C AAAT AAT AAAC C T AAAAT C T AT GT C ACT AT G AAGAT G GT T AAT G AAACT AT T AC A 
C AAAAT GAT G GT ACAC AT AGC AT T AT T AAG AAACT T T AT T TAG AT C G AC AAGGAAAT AT T 
GT AG C AC AAGAG AT T G T AT C AG AC ACT T TAAACC AAAC AAAAT C AAAAT C T AC AAAAAT C 
AAC C CT GT AACT AC AAT T C AC AAAAAAC AAT T AC AC C AAT T T AC AG C T AT T AAC CCT AT G 
AG AAAT TAT GG C AAAC CAT C AAACT C C ACT AC T GT AAAAT C AAAAC AAT T AC C AAAAAC A 
AACTCTGAATATGGACAATCATTCCTTATGTCTGTCTTTGGTGTTGGACTTATAGGAATT 
GCTTT AAAT ACAAAGAAAAAACAT AT GAAA 

SEQ ID NO. 7102 

STRAIN 090

AAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTGAC 
AAT ACT GGAAC AGC AAAT AT G C C T G AC GG AAAAGT T AC T AAT G CT GG C AC 
T G C T GC T C AAT TAG AT G CT T AT AT G GAT GAT GCT C AAAAAG AT T T C AAAC 
AAACTAACCCTAATGGTGAAAGCATTAGAGTTCAAGCTGGTGATATGGTT 
G GAG C AAGT C C AGC T AAC T C AGGG CT T CTT C AAG AT G AAC CAAC C GT T AA 
AACATTTAATGCAATGAATGTTGAGTATGGCACATTAGGTAACCATGAAT 
TTGATGAAGGTTTGGCAGAATACAATCGTATCGTTACTGGAAAGGCCCCT 
G C T C C AG AT T C T AAT AT AAAT AAT AT T AC GAAAT CAT AC C C AC ACG AAG C 
T G C AAAAC AAGAAAT TGT AGT GG C AAAC G T T AT T G AT AAAGT T AAC AAAC 
AAATCCCTTACAATTGGAAACCTTACGCTATTAAAAATATTCCTGTAAAT 
AAC AAAAGT GT G AAC GT T GGC T T T AT C GG AAT C G T T AC C AAAG AC AT CCC 
AAACCTTGTCTTACGTAAAAATTATGAACAATATGAATTTTTAGATGAAG 
CTGAAACAAT CGTT AAAT ACGC CAAAGAATT ACAAGCTAAAAAT GT C AAG 
G C T AT T GT AG TCCTTGCT CAT GT AC CT G C AAC AAG C AAGG AT GAT AT T GC 
T G AAGGT G AAG C AG C AG AAAT GAT GAAAAAAG T C AAT C AACT CTTCCCTG 
AAAATAGCGTAGATATTGTCTTTGCTGGACACAATCATCAATATACAAAT 
GGT CTTGTT GGT AAAACTCGC ATT GTACAAGCGCTCTCTC AAGG AAAAGC 
C TAT GCT G AC GT AC GT GGT GT C CT AG AT ACT GAT AC AC AAG AT T T CAT T G 
AAAC CCCTTCAGCTAAAGTAGTTGCAGTTGCT CCT GGT AAAAAAACAGGT 
AGTGCCGATATTCAAGCCATTGTTGACCAAGCTAATACTATCGTTAAACA 
AGTAACAGAAGCTAAAATTGGTACTGCCGAGGTAAGTGGCATGATTACGC 
GTTCTGTTGATCAAGATAATGTTAGTCCAGTAGGCAGCCTCATCACAGAG 
GCTCAACTAGCAATTGCTCGAAAAAGCTGGCCAGATATCGATTTTGCCAT 
G AC AAAT AAT G GT GG CAT T CGT G CT G AC T T AC T CAT C AAAC C AGAT G G AA 
CAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATCTTA 
CAAGTCGTCGAAATTACTGGTAGAGATCTTTATAAAGCACTCAACGAACA 
ATACGACCAAAAACAAAATTTCTTCCTTCAAATAGCTGGTCTGCGATACA 
CT T AC AC AG AT AAT AAAG AG G G C G GAG AAG AAAC AC CAT T T AAAGT T GT A 
AAAG C T TAT AAAT C AAAT GGT GAAG AAAT C AAT CCT GAT G C AAAAT AC AA 
ATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAGCT 
T C AGAAAT G C C AAAC T T C T AG GAG C CAT T AAT CCC GAT AC AG AG GT AT T T 
AT GG C C TAT AT C ACT GAT T T AGAAAAAGC T GGT AAAAAAGT G AG CGT T C C 
AAATAATAAACCTAAAATCTATGTCACTATGAAGATGGTTAATGAAACTA 
TTACACAAAATGATGGTACACATAGCATTATTAAGAAACTTTATTTAGAT 
C G AC AAGG AAAT AT T GT AG C AC AAG AG AT T GT AT C AG AC AC T T T AAAC C A 
AAC AAAAT C AAAAT CT AC AAAAAT CAACCCTGT AACT AC AATTC AC AAAA 
AAC AAT T AC AC C AAT T T AC AG C T AT T AAC C C T AT GAG AAAT TAT G GC AAA 
C CATC AAACT CC ACT ACT GT AAAAT C AAAAC AA 

SEQ ID NO. 7103 

STRAIN A909

GCGTCAATGACTTTCATGGTGCaCTTGACAATACTGGAACAGCAAATATG 
CCTGACGGAAAAGTTACTAATGCTGGCACTGCTGCTCAATTAGATGCTTA 
T AT GG AT GAT GCT C AAAAAG AT T T C AAAC AAAC T AAC C C T AAT GGT GAAA 
G CAT TAG AGT T C AAG C T G GT G AT AT G GT T GG AG C AAGT C C AG CT AAC T C A 
GGGCTTCTTCAAGATGAACCAACCGTTAAAACATTTAATGCAATGAATGT 
TGAGTATGGCACATTAGGTAACCATGAATTTGATGAAGGTTTGGCAGAAT
ACAATCGTATCGTTACTGGAAAGGCCCCTGCTCCaGaTTCTAATATAAAT 
AATATTACGAAATCATACCCACACGAAGCTGCAAAACAAGAAATTGTAGT 
GGCAAACGTTATTGATAAAGTTAACAAACAAATCCCTTACAATTGGAAAC 
C T T AC AC TAT T AAAAAT AT T C CT GT AAAT AAC AAAAG T G TG AACGT T GG C 
T T TAT CG GAAT CGT T AC C AAAG AC AT C C C AAAC C T T GT C TT AC GT AAAAA 
T TAT GAAC AAT AT GAAT T T T TAG AT G AAG CT GAAAC AAT CGT T AAAT ACG 
CCAAAGAATTACAAGCTAAAAATGTCAAGGCTATTGTAGTCCTTGCTCAT 
GT AC C T G C AAC AAGC AAG GAT GAT AT T G C T GAAGGT GAAGC AG C AG AAAT 
GATGAAAAAAGTCAATCAACTCTTCCCTGAAAATAGCGTAGATATTGTCT 
T TG CT G GAC AC AAT CAT C AAT AT AC AAAT GGTCTTGTT GGT AAAAC T C GT 
ATTGTACAAGCGCTCTCTCAAGGAAAAGCCTATGCTGATGTACGTGGTGT 
CCTAGATACTGATACACAAGATTTCATTGAAACCCCTTCAGCTAAAGTAA 
TTGCAGTTGCTCCTGGTAAAAAAACAGGTAGTGCCGATATTCAAGCCATT 
GT T GAC C AAG C T AAT ACT AT C GT T AAAC AAG T AAC AG AAGC T AAAAT T GG 
TACTGCCGAGGTAAGTGGCATGATTACGCGTTCTGTTGATCAAGATAATG 
T T AGT C C G GT AGG C AG C CT CAT C AC AGAG G CT C AACT AG C AAT T G C T C G A 
AAAAGC T G G C C AGAT AT C GAT T T T G C CAT GAC AAAT AAT GGT GG CAT T CG 
T G C T GAC T T AC T CAT C AAAC C AG AT GG AAC AAT C AC C T GGGG AGC T G C AC 
AAG C AGT T C AAC CT T T T GG T AAT AT CT T AC AAG T C GT C GAAAT TACT G GT 
AGAGAT CTT TAT AAAGCACT CAACGAAC AAT ACG AC C AAAAAC AAAAT TT 
CT T C C T T C AAAT AG C T GGT CT GCG AT AC AC T T AC AC AG AT AAT AAAG AGG 
GCGGGGAAGAAACACCATTTAAAGTTGTAAAAGCTTATAAATC AAAT GGT 
G AGG AAAT C AAT CC T GAT GC AAAAT AC AAAT T AGT TAT C AAT GAC TT T T T 
ATTCGGTGGTGGTGATGGCTTTGCAAGCTTCAGAAATGCCAAACTTCTAG 
GAG C C AT T AAT C C C GAT AC AGAGGT AT T TAT G G C C TAT AT C AC T GAT T T A 
GAAAAAGCT GGT AAAAAAGTGAGCGTTCCAAAT AAT AAAC CT AAAAT CT A 
T GT C AC TAT G AAG AT GGT T AAT GAAACT AT T AC AC AAAAT GAT G GT AC AT 
AT AG CAT TAT T AAG AAAC T T TAT T T AGAT C G AC AAGG AAAT AT T GT AGC A 
C AAG AG AT T GT AT C AGAC ACT T T AAAC C AAAC AAAAT C AAAAT C T AC AAA 
AAT C AAC C C T GT AAC T AC AAT T C AC AAAAAAC AAT TAG AC C AAT T T AC AG 
CT AT T AAC C C T AT GAG AAAT TAT G G C AAAC CAT C AAAC T C C ACT ACT GT A 
AAAT C AAAAC AA 

SEQ ID NO. 7104 

STRAIN H36B

CCAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTG 
AC AAT ACT GG AAC AG C AAAT AT G CC T G ACG G AAAAGT T AC T AAT G C T G G C 
ACTGCTGCTCAATTAGATGCTTATATGGATGATGCTCAAAAAGATTTCAA 
AC AAACT AAC C C T AAT G GT GAAAG CAT TAG AGT T C AAG CT G G T GAT AT G G 
T T GGAG C AAGT C C AG CT AAC T C AG GG CT T CT T C AAG AT GAAC C AAC CGT T 
AAAAC AT T T AAT GC AAT GAAT GT T G AGT AT GGC AC ATT AG GT AAC C AT GA 
ATTTGATGAAGGTTTGGCAGAATACAATCGTATCGTTACTGGAAAGGCCC 
CT GCT C CAGAT T CT AAT AT AAAT AAT ATT ACGAAAT CAT ACCCACACGAA 
GCTGC AAAAC AAGAAAT T GT AGT GGC AAACGT TAT T GAT AAAGT TAACAA 
AC AAAT CCCTT AC AATTGGAAACCTTACACT ATT AAAAAT ATT CCTGTAA 
AT AAC AAAAGT GT GAAC G T T GGCT T T AT CGGAAT C GT T AC C AAAG AC AT C 
C C AAAC CT T GT CT T AC G T AAAAAT TAT GAAC AAT AT GAAT T T T TAG AT G A 
AGCT GAAAC AAT CGT T AAAT AC G C C AAAG AAT T AC AAGC T AAAAAT GT C A 
AG G CT AT T G T AGT C C T T G CT C AT GT AC C T G C AAC AAG C AAG GAT GAT AT T 
GCT GAAGGT GAAG C AG C AG AAAT GAT G AAAAAAGT C AAT C AACT CT T C C C 
T G AAAAT AG C G T AGAT AT TGTCTTTGCTG GAC AC AAT CAT C AAT AT AC AA 
AT GGTCTTGTT GGT AAAACT CGT AT TGTACAAGCGCTCTCTCAAGGAAAA 
G C C TAT G CT GAT GT ACGT GGT G T C C TAG AT ACT GAT AC AC AAG AT T T CAT 
TGAAACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACAG 
GTAGTGCCGATATTCAAGCCATTGTTGACCAAGCTAATACTATCGTTAAA 
CAAGTAACAGAAGCTAAAATTGGTACTGCCGAGGTAAGTGGCATGATTAC 
GCGTTCTGTTGATCAAGATAATGTTAGTCCGGTAGGCAGCCTCATCACAG 
AGGCTC AACT AGC AAT T GCT CGAAAAAGCT GGC CAGAT AT CGATTTTGCC 
AT GAC AAAT AAT G G T GG CAT T C G T G CT GAC T T ACT CAT C AAAC CAGAT G G 
AAC AAT C AC C T GG G GAG C T G C AC AAG C AGT T C AAC CT T T T G GT AAT AT C T 
TACAAGT CGT CGAAATT ACT GGT AGAGAT CTTT AT AAAGC ACT C AAC GAA 
CAATACGACC AAAAAC AAAATTTCTTCCTTC AAAT AGCTGGTCTGCGAT A 
CACTTACACAGATAATAAAGAGGGCGGGGAAGAAACACCATTTAAAGTTG 
TAAAAGCTTATAAATCAAATGGTGAGGAAATCAATCCTGATGCAAAATAC
AAATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAG 
CT T C AG AAAT GC C AAACT T C T AGG AG C C AT T AAT C C C G AT ACAG AGGT AT 
T T AT GGC C TAT AT C ACT GAT T T AGAAAAAGCT GGT AAAAAAGT G AG C G T T 
CCAAATAATAAACCTAAAAT CTATGT CACTATGAAGATGGTTAAT GAAAC 
TATTACACAAAATGATGGTACATATAGCATTATTAAGAAACTTTATTTAG 
AT C G AC AAGG AAAT AT T GT AG C AC AAG AG AT T GT AT C AG AC ACT T T AAAC 
C AAAC AAAAT CAAAAT CT AC AAAAAT C AACCCTGT AACT AC AAT T C AC AA 
AAAAC AAT T AC AC C AAT T T AC AG CT AT T AAC C C T AT GAG AAAT TAT G G C A 
AAC CAT CAAACTCCACTACTGT AAAAT CAAA 

SEQ ID NO. 7105 

STRAIN 18RS21 

G AC C AAGT C GGT GT C C AAGT TAT AG G C GT C AAT G ACT T T C 
AT GGT GC AC T T G AC AAT ACT G G AAC AGC AAAT AT G C C T G AC G G AAAAGT T 
AnT AAT GCT GGC ACT G C T G CT C AAT TAG AT G CT T AT AT G GAT G AT GC T C A 
AAAAG AT T T C AAAC AAAC T AAC C CT AAT GGT G AAAG CAT T AGAGT T C AAG 
CTGGTGATATGGTTGGAGCAAGTCCAGCTAACTCAGGGCTTCTTCAAGAT 
G AAC C AAC C GT T AAAAC AT T T AAT G CAAT G AAT GT T G AGT AT GGC AC AT T 
AGGTAACCATGAATTTGATGAAGGTTTGGCAGAATACAATCGTATCGTTA 
C T G G AAAGG C CC CT GCT CCAGAT T C T AAT AT AAAT AAT AT T ACGAAAT C A 
T AC C C AC AC G AAGC T G C AAAAC AAG AAAT T GT AGT G G C AAAC G T T AT T G A 
T AAAGT T AAC AAAC AAAT C C CT T AC AAT T G GAAAC C T T AC ACT AT T AAAA 
ATATTCCTGTAAATAACAAAAGTGTGAACGTTGGCTTTATCGGAATCGTT 
AC C AAAG AC AT C C C AAAC CT T GT CT T AC GT AAAAAT TAT G AAC AAT AT GA 
AT T T T T AGAT G AAG CT GAAAC AAT C G T T AAAT AC G C C AAAG AAT T AC AAG 
CT AAAAAT GT C AAGGC T ATT GT AGT CCTTGCTC AT GTACCTGC AAC AAGC 
AAG GAT GAT AT T G C T G AAGGT GAAG C AG C AG AAAT GAT G AAAAAAG T C AA 
TCAACTCTTCCCTGAAAATAGCGTAGATATTGTCTTTGCTGGACACAATC 
AT CAAT AT AC AAAT GGTCTTGTT GG T AAAAC T CG T AT T G TAG AAG C G CT C 
TCTCAAGGAAAAGCCTATGCTGATGTACGTGGTGTCCTAGATACTGATAC 
ACAAGATTTCATTGAAACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTG 
GTAAAAAAACAGGTAGTGCCGATATTCAAGCCATTGTTGACCAAGCTAAT 
ACTATCGTTAAACAAGTAACAGAAGCTAAAATTGGTACTGCCGAGGTAAG 
T G G CAT GAT T AC GCGTTCTGTT GAT C AAG AT AAT GT T AGT C CGG TAG G C A 
GCCTCATCACAGAGGCTCAACTAGCAATTGCTCGAAAAAGCTGGCCAGAT 
ATCGATTTTGCCATGACAAATAATGGTGGCATTCGTGCTGACTTACTCAT 
C AAAC C AG AT GG AAC AAT C AC CT GGGGAG C T G C AC AAG C AGT T C AAC C T T 
TTGGTAATATCTTACAAGTCGTCGAAATTACTGGTAGAGATCTTTATAAA 
G C AC T C AAC G AAC AAT AC G ACC AAAAAC AAAAT TTCTTCCTT C AAAT AG C 
TGGTCTGCGATACACTTACACAGATAATAAAGAGGGCGGGGAAGAAACAC 
CAT T T AAAGT T GT AAAAG C T TAT AAAT C AAAT G GT G AG G AAAT CAAT C C T 
GATGCAAAATACAAATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGA 
TGGCTTTGCAAGCTTCAGAAATGCCAAACTTCTAGGAGCCATTAATCCCG 
ATACAGAGGTATTTATGGCCTATATCACTGATTTAGAAAAAGCTGGTAAA 
AAAGT GAG C GT T C C AAAT AAT AAAC CT AAAAT CT AT G T C AC TAT G AAGAT 
GGT T AAT GAAAC TAT T AC AC AAAAT GAT G GT AC AT AT AG CAT TAT T AAG A 
AAC T T T AT T TAG AT C G AC AAGG AAAT AT T GT AG C AC AAGAGAT T GT AT C A 
G AC ACT T T AAAC C AAAC AAAAT CAAAAT C T AC AAAAAT C AAC C CT G T AAC 
T AC AAT T C AC AAAAAAC AAT T AC AC C AAT T T AC AG C T AT T AAC C C TAT G A 
G AAAT T AT GG C AAAC C AT C AAACT CC ACT ACT GT AAAAT C AAAA 

SEQ ID NO. 7106 

STRAIN M732 

ACCAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTT 
G AC AAT AC T GG AAC AG C AAAT AT G C C T G AC G G AAAAGT T AC T AAT G C T G G 
CACTGCTGCT CAAT TAG AT GCT TAT AT G GAT GAT GCT C AAAAAG AT T T C A 
AAC AAAC T AAC C C T AAT GGT G AAAG CAT T AGAGT T C AAG CT GGT GAT AT G 
GTTGGAGC AAGT CCAGCTAACTCAGGGCTTCTTC AAGAT GAACCAACCGT 
T AAAAC AT T T AAT GC AAT G AAT G T T G AGT AT GG C AC AT T AGGT AAC CAT G 
AAT T T GAT G AAGGT T T GG C AG AAT AC AAT C G T AT CGT TACT G G AAAG G C C 
CCTGCT CC AG ATT CTAAT AT AAAT AAT ATT ACGAAAT CAT AC C C ACACGA 
AGCTGCAAAACAAGAAATTGTAGTGGCAAACGTTATTGATAAAGTTAACA 
AACAAATCCCTTACAATTGGAAACCTTACACTATTAAAAATATTCCTGTA 
AATAACAAAAGTGTGAACGTTGGCTTTATCGGAATCGTTACCAAAGACAT
C CC AAAC CT T GT C T TAG GT AAAAAT TAT G AAC AAT AT G AAT T T T TAG AT G 
AAGCT GAAACAAT CGT T AAATACG C C AAAGAAT TAC AAGCT AAAAAT GT C 
AAGGCTATTGTAGTCCTTGCTCATGTACCTGCAACAAGCAAGGATGATAT 
T GCT GAAGGT G AAG C AG C AGAAAT GAT G AAAAAAGT C AAT C AAC T CT T C C 
CT GAAAAT AG C GT AG AT AT TGTCTTTGCTG G AC AC AAT CAT C AAT AT AC A 
AATGGTCTTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAA 
AG C CT AT G CT G AT GTACGTGGTGTC C T AGAT AC T GAT AC AC AAG AT T T C A 
TTGAAACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACA 
G GT AGT G C CG AT AT T C AAG C CAT T GT T G AC C AAG C T AAT ACT AT CGT T AA 
AC AAGT AAC AG AAG CT AAAAT T G GT AC T G C C G AGGT AAG T GG C AT G AT T A 
CGCGTTCTGTTGATCAAGATAATGTTAGTCCGGTAGGCAGCCTCATCACA 
GAGGCTCAACTAGCAATTGCTCGAAAAAGCTGGCCAGATATCGATTTTGC 
CAT GACAAATAATGGTGGC ATT CGTGCTGACT TACT CAT CAAACCAGATG 
GAACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATC 
TTACAAGTCGTCGAAATTACTGGTAGAGATCTTTATAAAGCACTCAACGA 
AC AAT AC GAC CAAAAAC AAAAT TTCTTCCTT C AAAT AGC T GGT C T G C GAT 
AC ACT T ACAC AGAT AAT AAAGAGGGCGGGGAAGAAACACCATT T AAAGT T 
GT AAAAG C T TAT AAAT C AAAT GGT GAG G AAAT C AAT C CT G AT GC AAAAT A 
CAAATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAA 
G CT T C AG AAAT GC C AAACT T CT AGG AG C C AT T AAT C C CG AT AC AGAG GT A 
T T T AT G G C CT AT AT C ACT GAT T T AG AAAAAG C T G GT AAAAAAGT GAG CAT 
T CC AAAT AAT AAACCT AAAAT C T AT GT C AC TAT G AAGAT GG T T AAT G AAA 
CTATTACACAAAATGATGGTACATATAGCATTATTAAGAAACTTTATTTA 
GAT CG ACAAGGAAAT ATT GT AGC ACAAGAGATTGT AT CAG ACACTT TAAA 
C C AAAC AAAAT C AAAAT C TAC AAAAAT C AACC C T GT AAC TAC AAT T C AC A 
AAAAAC AAT TACACCAAT TT ACAGCT AT T AAC CCT AT GAGAAAT T AT GGC 
AAACC AT C AAACT C C ACT ACT GT AAAAT C AAAAC AA 

SEQ ID NO. 7107 

STRAIN COH1 

AC C AAG TCGGTGTC C AAGT TAT AGG CGT C AAT GAC T T T CAT G GT GC ACT T 
GAC AAT ACT G G AAC AG C AAAT AT G C CT GACG G AAAAG T TAC T AAT G CT G G 
CACT GCT GCT C AATT AGATGCT T AT ATGGAT GAT GCT C AAAAAGAT TT CA 
AAC AAACT AAC C CT AAT GGT GAAAG CAT TAG AGT T C AAGC T G GT GAT AT G 
GTTGGAGCAAGTCCAGCTAACTCAGGGCTTCTTCAAGATGAACCAACCGT 
T AAAACAT T TAATGC AAT GAATGT TGAGTAT GGCACATT AGGTAACC AT G 
AAT T T GAT G AAG GT T T G G CAG AAT AC AAT C G TAT C GT T AC T GG AAAG GCC 
CCT GCT C C AGAT T CT AATAT AAAT AAT AT TACGAAAT CAT AC CC AC ACGA 
AG CT G C AAAAC AAGAAAT T GT AGT G G C AAAC G T TAT T GAT AAAGT T AAC A 
AACAAAT C C CT TAC AATT GG AAACCT TAC ACT AT T AAAAAT ATT CCT GT A 
AAT AAC AAAAGT GT G AAC GT T G G C T T TAT C GG AAT C GT T AC C AAAG AC AT 
C C C AAAC CTTGtCTTAC G T AAAAAT TAT G AAC AAT AT GAAT T T T TAG AT G 
AAG CT GAAACAAT C GT T AAAT AC GC C AAAG AAT TAC AAG C T AAAAAT GT C 
AAGG C TAT T G T AGT C C T T GCT C AT GT AC CT G C AAC AAGC AAG GAT GAT AT 
T GCT GAAGGT GAAGC AGCAG AAAT GATGAAAAAAGTCAATCAACTCTTCC 
CTG AAAAT AGC GT AG AT ATT GT CTTT GCT GG AC AC AAT CATC AAT AT AC A 
AATGGTCTTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAA 
AG C CT AT GCT GAT GT AC GTGGTGTC CT AG AT AC T GAT AC AC AAG AT T T C A 
TTGAAACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACA 
GGTAGTGCCGATATT C AAGC CAT TGtTGACCAAGCTAAT ACT ATCGTTAA 
AC AAGT AAC AG AAGCT AAAAT TGGT ACT GCCG AGGT AAGT GGC ATG ATT A 
CGCGTTCTGTTGATCAAGATAATGTTAGTCCGGTAGGCAGCCTCATCACA 
GAG GCT C AAC TAG C AAT T G C T CG AAAAAG CT GG C CAG AT AT C GAT T T T G C 
CAT GAC AAAT AAT GGTGG CAT T C G T G C T G AC T T AC T CAT C AAAC CAG AT G 
GAACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATC 
TTACAAGTCGTCGAAATTACTGGTAGAGATCTTTATAAAGCACTCAACGA 
AC AAT AC GAC CAAAAAC AAAAT TTCTTCCTT C AAAT AG CT G GT C T G C G AT 
ACAC T TAC AC AGAT AAT AAAG AGG G C GGG G AAG AAAC AC C AT T T AAAGT T 
G T AAAAG CT T AT AAAT C AAAT G GT G AGG AAAT C AAT C CTG AT GC AAAAT A 
C AAAT T AGT TAT C AAT G ACT T T T TAT TCGGTGGT GGT GAT G G CT T TG C AA 
G CT T CAG AAAT G C C AAAC T T CT AG GAG C C AT T AAT C C C GAT AC AG AGGT A 
T T T AT GG C C TAT AT C AC T GAT T TAG AAAAAG C T G GT AAAAAAG T GAG CAT 
T C C AAAT AAT AAAC C T AAAAT C T AT GT C AC TAT G AAG AT GGT T AAT G AAA 
CTATTACACAAAATGATGGTACATATAGCATTATTAAGAAACTTTATTTA
GAT C G AC AAG G AAAT AT T GT AG C AC AAG AG AT T GT AT C AGAC ACT T T AAA 
C CAAAC AAAAT C AAAAT CT AC AAAAAT C AAC C CT GT AAC T AC AAT T C AC A 
AAAAAC AAT T AC AC C AAT T T AC AG C T AT T AAC C C TAT G AGAAAT T AT GGC 
AAACC AT C AAACT C C ACT AC T G T AAAAT C AAA 

SEQ ID NO. 7108 

STRAIN M781 

CAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTGA 
CAATACTGGAACAGCAAATATGCCTGACGGAAAAGTTACTAATGCTGGCA 
CTGCTGCTCAATTAGATGCTTATATGGATGATGCTCAAAAAGATTTCAAA 
C AAAC T AAC C C T AAT GGT GAAAG CAT T AGAGT T C AAG CT G GT GAT AT GGT 
T G GAG C AAGT C C AG C T AAC T C AGGGC T T C T T C AAG AT G AAC C AAC C G TT A 
AAAC AT T T AAT G C AAT G AAT GT T GAGT AT GG C AC AT T AG GT AAC CAT GAA 
T T T GAT G AAG G T T T G G C AG AAT AC AAT CGT AT CGT TACT GGAAAGGC C C C 
TGCT C C AGAT T CTAAT AT AAAT AAT ATT ACGAAAT CAT ACC CACACGAAG 
CT G CAAAAC AAG AAAT T GT AGT GG CAAAC G T T AT T GAT AAAGT T AAC AAA 
CAAATCCCTTACAATTGGAAACCTTACACTATTAAAAATATTCCTGTAAA 
T AAC AAAAGT GT GAAC G T T GG C T T T AT CG GAAT CGT T AC C AAAGAC AT C C 
C AAAC CT T GT CT TACGT AAAAAT TAT GAAC AAT AT GAAT T T T T AGAT GAA 
G C T G AAAC AAT CGT T AAAT ACG C C AAAG AAT T AC AAG CT AAAAAT GT C AA 
G GC T AT T GT AGT C CT T G C T C AT GT AC CT G C AAC AAG C AAGG AT GAT AT T G 
CTGAAGGTGAAGCAGCAGAAATGATGAAAAAAGTCAAtCAACTCTTCCCT 
GAAAATAGCGTAGATATTGTCTTTGCTGGACACAATCATCAATATACAAA 
TGGTCTTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAAAG 
C CT AT G C T G AT GT AC GT G G T GT C C TAG AT AC T GAT AC AC AAG AT T T CAT T 
G AAAC C C C T T C AG CT AAAGT AAT T G C AGT T G C T C CT GGT AAAAAAAC AGG 
TAGTGCCGATATTCAAGCCATTGtTGACCAAGCTAATACTATCGTTAAAC 
AAGT AAC AG AAG CT AAAAT T GGT ACT G C C G AG GT AAG T G GC AT G AT T AC G 
CGTTCTGTT GAT C AAG AT AAT GT T AGT C C GGT AG G C AGC CT CAT C AC AG A 
GGCTCAACTAGCAATTGCTCGAAAAAGCTGGCCAGATATCGATTTTGCCA 
TGACAAATAATGGTGGCATTCGTGCTGACTTACTCATCAAACCAGATGGA 
ACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATCTT 
AC AAGT CGT CG AAAT T AC T GGT AG AG AT C T T TAT AAAG C ACT C AAC GAAC 
AATACGACCAAAAACAAAATTTCTTCCTTCAAATAGCTGGTCTGCGATAC 
ACT T AC AC AG AT AAT AAAG AGGGC GGG GAAGAAAC AC CAT T T AAAGT T GT 
AAAAGCTTATAAATCAAATGGTGAGGAAATCAATCCTGATGCAAAATACA 
AATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAGC 
TTCAGAAATGCCAAACTTCTAGGAGCCATTAATCCCGATACAGAgGTATT 
TAT GGC C TAT AT C AC T GAT T T AGAAAAAGC T GGT AAAAAAGT GAG CAT T C 
C AAAT AAT AAAC CT AAAAT C T AT GT C ACT AT G AAG AT GG T T AAT G AAAC T 
AT T AC AC AAAAT GAT G GT AC AT AT AG CAT T AT T AAG AAACTT T AT T T AG A 
T CGAC AAGG AAAT AT T GT AGCACAAGAGATT GT AT C AGAC ACTTT AAACC 
AAACAAAATCAAAATCTACAAAAATCAACCCTGTAACTACAATTCACAAA 
AAAC AAT T AC AC C AAT T T AC AG C T AT T AAC C C T AT GAG AAAT TAT GG C AA 
AC CAT CAAAC T C C AC T AC T GT AAAAT C AAA 

SEQ ID NO. 7109 

STRAIN CJB110 

GACCAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGC 
AC T T GAC AAT AC T G GAAC AG C AAAT AT GC CT G AC G G AAAAGT T AC T AAT G 
CTGGCACTGCTGCTCAATTAGATGCTTATATGGATGATGCTCAAAAAGAT 
TTCAAACAAACTAACCCTAATGGTGAAAGCATTAGAGTTCAAGCTGGTGA 
TATGGTTGGAGCAAGTCCAGCTAACTCAGGGCTTCTTCAAGATGAACCAA 
C C GT T AAAAC AT T T AAT G C AAT GAAT G T T GAG TAT G G C AC AT T AG GT AAC 
CAT GAAT T T GAT G AAG GT T T G G C AG AAT AC AAT C GT AT CGTTACTG G AAA 
GGCCCCTGCTCC AG ATT cTAAT AT AAAT AAT ATT ACGAAAT C AT ACCC AC 
AC GAAG C T G CAAAAC AAG AAAT T G T AGT G G CAAAC G T TAT T GAT AAAGT T 
AACAAACAAATCCCTTACAATTGGAAACCTTACGCTATTAAAAATATTCC 
TGTAAATAACAAAAGTGTGAACGTTGGCTTTATCGGAATCGTTACCAAAG 
ACATCCCAAACCTTGTCTTACGTAAAAATTATGAACAATATGAATTTTTA 
GAT GAAG C T G AAAC AAT CGT T AAAT AC G C C AAAG AAT T AC AAG C T AAAAA 
T GT C AAGGCT ATTGT AGT CCT TGCT CATGT AC CTGC AAC AAGC AAGG ATG 
AT AT TGCT G AAG GT GAAG C AG C AG AAAT GAT GAAAAAAG T C AAT C AACT C 
TTCCCTGAAAATAGCGTAGATATTGTCTTTGCTGGACACAATCATCAATA
TACAAATGGTCTTGTTGGTAAAACTCGCATTGTACAAGCGCTCTCTCAAG 
GAAAAGCCTATGCTGACGTACGTGGTGTCCTAGATACTGATACACAAGAT 
TTCATTGAAACCCCTTCAGCTAAAGTAGTTGCAGTTGCTCCTGGTAAAAA 
AACAGGTAGTGCCGATATTCAAGCCATTGTTGACCAAGCTAATACTATCG 
T T AAACAAGTAAC AGAAG CT AAAAT T G GT ACT G CC G AGGT AAGT GG C AT G 
ATTACGCGTTCTGTTGATCAAGATAATGTTAGTCCAGTAGGCAGCCTCAT 
C AC AGAG GC T C AAC T AG C AAT T G C T C GAAAAAG CT GG C C AG AT AT C GAT T 
T T GC CAT G AC AAAT AAT G GT GGC AT T C GT G CT G ACT TACT CAT C AAAC C A 
GATGGAACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAA 
TAT C T T AC AAGT C GT CG AAAT TAG T GGT AG AG AT C T T T AT AAAG C AC T C A 
ACG AAC AAT AC GAC CAAAAAC AAAAT TTCTTCCTT C AAAT AG CT G GT C T G 
CGAT ACACTT ACACAGAT AAT AAAGAGGGCGGAGAAGAAAC AC CAT T T AA 
AGTTGTAAAAGCT TAT AAAT C AAAT GGT GAAGAAATC AAT CCTGATGCAA 
AATACAAATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTT 
G C AAGC T T C AG AAAT GC C AAAC T T C T AGG AG C CAT T AAT C C C GAT AC AG A 
G GT AT T TAT GGC C TAT AT C AC T GAT T T AGAAAAAGC T G G T AAAAAAGT G A 
G C G T T C C AAAT AAT AAAC C T AAAAT C T AT GT C AC TAT G AAG AT GGT T AAT 
GAAACTATTACACAAAATGATGGTACACATAGCATTATTAAGAAACTTTA 
T TTAGAT CGAC AAGGAAAT ATTGTAGCACAAGAGAT T GT AT C AG AC ACT T 
T AAAC C AAAC AAAAT C AAAAT C T AC AAAAAT C AAC C CT GT AACT AC AATT 
CACAAAAAACAATTACACCAATTTACAGCTATTAACCCTATGAGAAATTA 
T GGCAAACC AT CAAACT CCACT ACTGT AAAAT C A 

SEQ ID NO. 7110 

STRAIN 1169NT 

CAAGTCGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTGA 
C AAT AC T GG AAC AG C AAAT AT G C CT G AT GG AAAAGT T G C T AAT G C T GG T A 
CTGCTGCT C AAT T AGAT G C T TAT AT GG AT GAC GC T C AAAAAG AT T T C AAA 
C AAAC T AAC C C T AAT GGT G AAAG C AT T AGG GT T C AAG C AG G C GAT AT GGT 
T G GAG C AAGT C C AG C C AACT CTGGGCTTCTT C AAG AT G AAC C AACT GT C A 
AAAAT T T T AAT G C AAT G AAT GT T G AGT AT GGC AC AT T G G GT AAC CAT GAA 
TTTGATGAAGGGTTGGCAGAATATAATCGTATCGTTACTGGTAAAGCCCC 
TGCTCCAGATTCTAATATTAATAATATTACGAAATCATACCCACATGAAG 
CTGCAAAACAAGAAATTGTAGTGGCAAATGTTATTGATAAAGTTAACAAA 
CAAATTCCTTACAATTGGAAGCCTTACGCT ATT AAAAAT ATT CCTGT AAA 
TAACAAAAGTGTGAACGTTGGCTTTATCGGGATTGTCACCAAAGACATCC 
C AAAC CT T G T C T T AC GT AAAAAT T AT GAAC AAT AT G AAT T T T TAG AT GAA 
G C T G AAAC AAT CGT T AAAT AC G C C AAAGAAT T AC AAG C T AAAAAT G T C AA 
AG CT AT T GT AG t T CT C G C AC AT GT AC CT G C AAC AAGT AAAAAT GAT AT T G 
C T GAAGGT G AAGC AG C AG AAAT GAT G AAAAAAG T C AAT C AAC TCTTCCCT 
G AAAAT AG C GT AG AT AT TGTCTTTGCTG GAC AC AAT CAT C AAT AT AC AAA 
TGGTCTTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAAAG 
CCTATGCTGATGTACGTGGTGTCTTAGATACTGATACACAAGATTTCATT 
GAGACCCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACAGG 
T AGT G C C GAT AT T C AAG C CAT T GT T GAC C AAG C T AAT AC TAT C GT T AAAC 
AAG T AAC AG AAGC T AAAAT TGGTACTGCC G AGGT AAG T G T CAT G AT T AC G 
CGTTCTGTT GAT C AAG AT AAT GT T AGT C C GG T AGG C AG C C T CAT C AC AG A 
GGCTCAACTAGCAATTGCTCGAAAAAGCTGGCCAGATATCGATTTTGCCA 
T GAC AAAT AAT GGT GG CAT T C GT G C T G AC T T AC T CAT C AAAC C AG AT GG A 
ACAATCACCTGGGGAGCTGCACAAGCAGTTCAACCTTTTGGTAATATCTT 
AC AAG T CGT C G AAAT T AC T GGT AG AG AT C T T TAT AAAG C AC T C AAC GAAC 
AAT AC GAC CAAAAAC AAAAT TTCTTCCTT C AAAT AG CT G GT C T G C GAT AC 
AC T T AC AC AG AT AAT AAAGAG GG CG GG G AAG AAAC AC CAT T T AAAGT T G T 
AAAAGC T TAT AAAT C AAAT G GT G AG G AAAT C AAT C C T GAT G C AAAAT AC A 
AATTAGTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAGC 
T T C AGAAAT G C CAAACT T C T AGG AG C CAT T AAC C C C GAT AC AG AG G T AT T 
T AT GG C C TAT AT C ACT GAT T TAG AAAAAG C T G GT AAAAAAGT GAG C G T T C 
C AAAT AAT AAACC T AAAAT C T AT GT C AC T AT G AAG AT GGT T AAT G AAAC T 
AT T AC AC AAAAT G AT GG TAG AC AT AG CAT TAT T AAG AAAC T T TAT T TAG A 
T C GAC AAGGAAAT AT T G TAG C AC AAG AG AT T GT AT C AG AC AC T T T AAAC C 
AAACAAAATCAAAATCTACAAAAATCAACCCTGTAACTACAATTCACAAA 
AAAC AAT T AC AC C AAT T T AC AGC T AT T AAC C C T AT GAG AAAT TAT G G C AA 
ACCATCAAACTCCACTACTGTAAAATCAAA 

SEQ ID NO. 7111

STRAIN JM9130013 

CGGTGTCCAAGTTATAGGCGTCAATGACTTTCATGGTGCACTTGACAATA 
CTGGAACAGCAAATATGCCTGACGGAAAAGTTACTAATGCTGGCACTGCT 
G CT C AAT T AGAT GCTT AT AT GGAT GAT G C T C AAAAAG AT T TC AAAC AAAC 
TAACCCTAATGGTGAAAGCATTAGAGTTCAAGCTGGTGATATGGTTGGAG 
C AAGT C C AG CT AAC T C AG GGCTTCTT C AAG AT G AAC C AAC CGT T AAAAC A 
T T T AAT G C AAT G AAT G T T GAG T AT G G C AC AT TAG G T AAC CAT G AAT T T G A 
TGAAGGTTTGGCAGAATACAATCGTATCGTTACTGGAAAGGCCCCTGCTC 
C AG AT T c T AAT AT AAAT AAT AT TAG G AAAT CAT AC C C AC ACG AAG CT GC A 
AAAC AAGAAAT T GT AGT GGCAAACGTT ATT GAT AAAGTT AAC AAAC AAAT 
CCCTTACAATTGGAAACCTTACACTATTAAAAATATTCCTGTAAATAACA 
AAAGTGTGAACGTTGGCTTTATCGGAATCGTTACCAAAGACATCCCAAAC 
C T T GT C T T AC G T AAAAAT T AT G AAC AAT AT G AAT T T T T AGAT G AAG C T G A 
AAC AAT CGT T AAAT AC G C C AAAG AAT T AC AAG C T AAAAAT GT C AAG G C T A 
TTGTAGTCCTTGCTCATGTACCTGCAACAAGCAAGGATGATATTGCTGAA 
GGT GAAG C AG C AGAAAT GAT G AAAAAAG T C AAT C AAC TCTTCCCT G AAAA 
TAG C GT AG AT AT TGTCTTTGCTG G AC AC AAT CAT C AAT AT AC AAAT GGT C 
TTGTTGGTAAAACTCGTATTGTACAAGCGCTCTCTCAAGGAAAAGCCTAT 
GCTGATGTACGTGGTGTCCTAGATACTGATACACAAGATTTCATTGAAAC 
CCCTTCAGCTAAAGTAATTGCAGTTGCTCCTGGTAAAAAAACAGGTAGTG 
C CGAT AT T C AAG C CAT T GT T G AC C AAG C T AAT ACT AT C GT T AAAC AAGT A 
ACAGAAGCTAAAATTGGTACTGCCGAGGTAAGTGGCATGATTACGCGTTC 
T GT T GAT C AAG AT AAT GT T AGT C CGGT AG GC AG C CT CAT C AC AG AGG CT C 
AAC TAG C AAT T G C T C G AAAAAG C T GG C C AG AT AT C GAT T T TG C CAT G AC A 
AAT AAT GGT G G C AT T C GT G C T G AC T T AC T CAT C AAAC C AG AT G G AAC AAT 
C AC C T GGGG AG C T G C AC AAG C AGT T C AAC C T T T T GGT AAT AT C T T AC AAG 
T CGT C G AAAT T AC T G G TAG AG AT C T T TAT AAAG C ACT C AACG AAC AAT AC 
GACCAAAAACAAAATTTCTTCCTTCAAATAGCTGGTCTGCGATACACTTA 
C AC AG AT AAT AAAG AG G G C G G G GAAG AAAC AC CAT T T AAAGT T G T AAAAG 
C T TAT AAAT C AAAT GGT GAG G AAAT C AAT C CT G AT G C AAAAT AC AAAT T A 
GTTATCAATGACTTTTTATTCGGTGGTGGTGATGGCTTTGCAAGCTTCAG 
AAATGCCAAACTTCTAGGAGCCATTAATCCCGATACAGAGGTATTTATGG 
C CT AT AT C ACT GAT T TAG AAAAAG C T G GT AAAAAAGT GAG CGT T C C AAAT 
AATAAACCTAAAATCTATGTCACTATGAAGATGGTTAATGAAACTATTAC 
ACAAAATGATGGTACATATAGCATTATTGAGAAACTTTATTTAGATCGAC 
AAG GAAAT AT T GT AG C AC AAG AG AT T GT AT C AG AC AC T T T AAAC C AAAC A 
AAAT C AAAAT C T AC AAAAAT C AAC C C T G T AAC T AC AAT T C AC AAAAAAC A 
AT T AC AC C AAT T T AC AG C T AT T AAC C C T AT GAG AAAT TAT GG C AAAC CAT 
CAAACTCCACTACTGTAAAATCAAAA 



SEQ ID NO. 7112 

STRAIN 2603 frame: 1

MKKKIILKSSVLGLVAGTSIMFSSVFADQVGVQVIGVNDFHGALDNTGTANMPDGKVANA 
GTAAQLDAYMDDAQKDFKQTNPNGESIRVQAGDMVGASPANSGLLQDEPTVKNFNAMNVE 
YGTLGNHEFDEGLAEYNRIVTGKAPAPDSNINNITKSYPHEAAKQEIVVANVIDKVNKQI 
PYNWKPYAIKNIPVNNKSVNVGFIGIVTKDIPNLVLRKNYEQYEFLDEAETIVKYAKELQ 
AKNVKAIVVLAHVPATSKNDIAEGEAAEMMKKVNQLFPENSVDIVFAGHNHQYTNGLVGK
TRIVQALSQGKAYADVRGVLDTDTQDFIETPSAKVIAVAPGKKTGSADIQAIVDQANTIV 
KQVTEAKIGTAEVSVMITRSVDQDNVSPVGSLITEAQLAIARKSWPDIDFAMTNNGGIRA 
DLLIKPDGTITWGAAQAVQPFGNILQVVEITGRDLYKALNEQYDQKQNFFLQIAGLRYTY 
TDNKEGGEETPFKVVKAYKSNGEEINPDAKYKLVINDFLFGGGDGFASFRNAKLLGAINP 
DTEVFMAYITDLEKAGKKVSVPNNKPKIYVTMKMVNETITQNDGTHSIIKKLYLDRQGNI 
VAQEIVSDTLNQTKSKSTKINPVTTIHKKQLHQFTAINPMRNYGKPSNSTTVKSKQLPKT
NSEYGQSFLMSVFGVGLIGIALNTKKKHMK 



SEQ ID NO. 7113 

STRAIN 090 frame: 3 

VGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIRV 
QAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPDS
NINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYAIKNIPVNNKSVNVGFIGIVTK 
DIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAEM
MKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFIE 
TPSAKVVAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSPV
GSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVVE
ITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPDA
KYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKIY
VTMKMVNETITQNDGTHSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHKK 
QLHQFTAINPMRNYGKPSNSTTVKSKQ 

SEQ ID NO. 7114 

STRAIN A909 frame: 3

VNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIRVQAGDMVG 
ASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPDSNINNITK 
SYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVTKDIPNLVL
RKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAEMMKKVNQL 
FPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFIETPSAKVI 
AVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSPVGSLITEA 
QLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVVEITGRDLY
KALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPDAKYKLVIN 
DFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKIYVTMKMVN 
ETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHKKQLHQFTA 
INPMRNYGKPSNSTTVKSKQ 

SEQ ID NO. 7115 

STRAIN H36B frame: 2

QVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIR 
VQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD 
SNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVT
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAE
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV 
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPD
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKI 
YVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK 
KQLHQFTAINPMRNYGKPSNSTTVKS 

SEQ ID NO. 7116 

STRAIN 18RS21 frame: 1 

DQVGVQVIGVNDFHGALDNTGTANMPDGKVXNAGTAAQLDAYMDDAQKDFKQTNPNGESI
RVQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAP
DSNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIV
TKDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAA
EMMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDF 
IETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVS 
PVGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQV 
VEITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINP
DAKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPK 
IYVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIH 
KKQLHQFTAINPMRNYGKPSNSTTVKSK 

SEQ ID NO. 7117 

STRAIN M732 frame: 3

QVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIR
VQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD
SNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVT
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAE
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV 
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPD
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSIPNNKPKI
YVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK 
KQLHQFTAINPMRNYGKPSNSTTVKSKQ 

SEQ ID NO. 7118 

STRAIN COH1 frame: 3

QVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIR 
VQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD
SNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVT
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAE
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPD 
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSIPNNKPKI
YVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK 
KQLHQFTAINPMRNYGKPSNSTTVKS 

SEQ ID NO. 7119 

STRAIN M781 frame: 1 

QVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIR
VQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD
SNINNITKSYPHEAAKQEIVVANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVT 
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIVVLAHVPATSKDDIAEGEAAE
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKVVKAYKSNGEEINPD 
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSIPNNKPKI
YVTMKMVNETITQNDGTYSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK
KQLHQFTAINPMRNYGKPSNSTTVKS 

SEQ ID NO. 7120 

STRAIN CJB110 frame: 1 

DQVGVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESI 
RVQAGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAP
DSNINNITKSYPHEAAKQEIWANVIDKVNKQIPYNWKPYAIKNIPVNNKSVNVGFIGIV 
TKDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIWLAHVPATSKDDIAEGEAA 
EMMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDF 
IETPSAKWAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVS 
PVGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQV 
VEITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKWKAYKSNGEEINP 
DAKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPK
IYVTMKMVNETITQNDGTHSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIH 
KKQLHQFTAINPMRNYGKPSNSTTVKS

SEQ ID NO. 7121 

STRAIN 1169NT frame: 1 

QVGVQVIGVNDFHGALDNTGTANMPDGKVANAGTAAQLDAYMDDAQKDFKQTNPNGESIR 
VQAGDMVGASPANSGLLQDEPTVKNFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPD 
SNINNITKSYPHEAAKQEIWANVIDKVNKQIPYNWKPYAIKNIPVNNKSVNVGFIGIVT
KDIPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIWLAHVPATSKNDIAEGEAAE 
MMKKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFI 
ETPSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSVMITRSVDQDNVSP 
VGSLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVV 
EITGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKWKAYKSNGEEINPD 
AKYKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKI
YVTMKMVNETITQNDGTHSIIKKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHK 
KQLHQFTAINPMRNYGKPSNSTTVKS 

SEQ ID NO. 7122 

STRAIN JM9130013 frame: 2 

GVQVIGVNDFHGALDNTGTANMPDGKVTNAGTAAQLDAYMDDAQKDFKQTNPNGESIRVQ 
AGDMVGASPANSGLLQDEPTVKTFNAMNVEYGTLGNHEFDEGLAEYNRIVTGKAPAPDSN
INNITKSYPHEAAKQEIWANVIDKVNKQIPYNWKPYTIKNIPVNNKSVNVGFIGIVTKD
IPNLVLRKNYEQYEFLDEAETIVKYAKELQAKNVKAIWLAHVPATSKDDIAEGEAAEMM 
KKVNQLFPENSVDIVFAGHNHQYTNGLVGKTRIVQALSQGKAYADVRGVLDTDTQDFIET 
PSAKVIAVAPGKKTGSADIQAIVDQANTIVKQVTEAKIGTAEVSGMITRSVDQDNVSPVG 
SLITEAQLAIARKSWPDIDFAMTNNGGIRADLLIKPDGTITWGAAQAVQPFGNILQVVEI 
TGRDLYKALNEQYDQKQNFFLQIAGLRYTYTDNKEGGEETPFKWKAYKSNGEEINPDAK
YKLVINDFLFGGGDGFASFRNAKLLGAINPDTEVFMAYITDLEKAGKKVSVPNNKPKIYV 
TMKMVNETITQNDGTYSIIEKLYLDRQGNIVAQEIVSDTLNQTKSKSTKINPVTTIHKKQ 
LHQFTAINPMRNYGKPSNSTTVKSK 

SEQ ID NO. 7201 
STRAIN 2603 

ATGAATAAACGCGTAAAAATCGTTGCAACACTTGGTCCTGCGGTTGAATTCCGTGGTG 

GTAAGAAGTTTGGTGAGTCTGGATACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAG 

AAAAAATTGCTCAATTGATTAAAGAAGGTGCTAACGTTTTCCGTTTCAACTTCTCACATG 

GAGATCATGCTGAGCAAGGAGCTCGTATGGCTACTGTTCGTAAAGCAGAAGAGATTGCAG 

GACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAAATTCGTACAGAACTTTTTG 

AAGATGGTGCAGATTTCCATTCATATACAACAGGTACAAAATTACGTGTTGCTACTAAGC 

AAGGTATCAAATCAACTCCAGAAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCT

TTGATGACGTTGAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTG 

TGTTTGCAAAAGATAAAGACACTCGTGAATTTGAAGTAGTTGTTGAGAATGATGGCCTTA 

TTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATTCCTTTCCCAGCACTTGCAG

AACGCGATAATGCTGATATCCGTTTTGGACTTGAGCAAGGACTTAACTTTATTGCTATCT 

CATTTGTACGTACTGCTAAAGATGTTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGsm 

ATGGACACGTTAAGTTGTTTGCTAAAATTGAAAATCAACAAGGTATCGATAATATTGATG

AGATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTATCGAAGTTC 

CATTTGAAATGGTTCCAGTTTACCAAAAAATGATCATTACTAAAGTTAATGCAGCTGGTA 

AAGCAGTTATTACAGCAACAAATATGCTTgAAACAATGACTGATAAACCACGTGCGACTC

GTTCAGAAGTATCTGATGTCTTCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTT 

CAGGTGAGTCAGCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTG 

ATAAAAATGCTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTGCATTCCCAC 

GTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGATGCAACACACTCAATGGATA 

TCAAACTTGTTGTAACAATTACTGAAACAGGTAATACAGCTCGTGCCATTTCTAAATTCC 

GTCCAGATGCAGACATTTTGGCTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGA

TTAACTGGGGTGTTATCCCTGTCCTTGCAGACAAACCAGCATCTACAGATGATATGTTTG 

AGGTTGCAGAACGTGTAGCACTTGAAGCAGGATTTGTTGAATCAGGCGATAATATCGTTA 

TCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACAATGCGTGTTCGTACTGTTA 

AA 

SEQ ID NO. 7202 

STRAIN 090

AATAAACGCGTAAAAATCGTTGCAACACT 

TGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTGGAT 
ACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAGAAAAAATTGCTCAA 
T T G ATT AAAG AAG GT G C T AAC GT TTTCCGTTT C AAC T T CT C AC AT GG AG A 
T C AT G CT GAG C AAGGAG CT CGT AT GG CT ACT GTTCGTAAAGC AG AAG AG A 
TTGCAGGACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAAATT 
C GT AC AGAAC T T T T T G AAG AT GGT T C AG AT T T C CAT T CAT AT AC AAC AG G 
T AC AG AAT T ACGT GT T G C TACT AAG C AAG GT AT C AAAT C AAC T C C AG AAG 
TGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTTGAA 
GTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGTGTT 
T GC AAAAG AT AAAG AC ACT C g T GAAT T T G AAGT AGT T GT T GAG AAT GAT G 
GCCTTATTGGTAAACAaaaaGGTGTAAACATCCCTTATACTAaAATTCCT 
TTCCCAgCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACTTGA 
GCAAGGACTTAACTTTATTGCTATCTCATTTGTACGTACTGCTAAAGATG 
T T AAT G AAGT T CG T G C T AT T T GT GAAG AAAC T GG C AAT G G AC AT GT T AAG 
TT GTTTGCTAAAATT GAAAAT C AAC AAGGTATCGAT AAT ATT GATGAG AT 
TATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTATCG 
AAGTTCCATTTGAAATGGTTCCAGTTTACCAAAAAATGAT CAT TACT AAA 
GTT AATGCAGCT GGT AAAGCAGTT AT TACAGC AAC AAAT AT GCTT GAAAC 
AAT G ACT GAT AAAC C AC GT G C G AC T C GT T C AG AAGT AT CT GAT GT CT T C A 
ATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCAGCT 
AATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGATAA 
AAAT G CT C AAAC AT T AC T C AAT G AGT AT GGT C G C T TAG AC T CAT CT G CAT 
TCCCACGTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGATGCA 
AC AC ACT C AAT G GAT AT C AAAC T T G T T GT G AC AAT T AC T GAAAC AG GT AA 
TACAGCTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGGCTG 
TT AC AT T TG AT G AAAAAGT AC AAC GT T CAT T G ATG AT TAACT GGGGT GTT 
AT CCCTGTCCTTG C AG AC AAAC C AG C AT CT AC AG AT GAT AT GT T T G AG GT 
TGCAGAACGTGTAGCACTTGAAGCAGGACTTGTTGAATCAGGCGATAATA
T C G T TAT CGT TG C AG GT GT T C CT GT AG G T AC AGGT GGAACT AAC AC AAT G 
CGTGTTCGTACTGTTAAA 

SEQ ID NO. 7203 

STRAIN A909 

AATAAACGCGTAAAAATCGTTGCAACACTTGGTC 

CTGCGGTTGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTGGATACTGG 
G GT G AAAG C C T T G AC GT AG AAG CT T C AG C AG AAAAAAT T G C T C AAT T GAT 
T AAAG AAGGT GCT AACGTTTT CCGTTTCAACTTCTC AC AT GG AG AT C AT G 
C T GAG C AAGGAGCT CGT ATGGCTACTGT T CGTAAAGCAGAAGAGAT T GC A 
GGACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAAATTCGTAC 
AGAACTTTTTGAAGATGGTGCAGATTTCCATTCATATACAACAGGTACAA 
AAT T ACGTGT TGCT ACT AAGCAAGGT AT C AAAT C AACT CC AGAAGT GAT T 
GCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTTGAAGTTGG 
TAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGTGTTTGCAA 
AAG AT AAAGAC AC T CGT GAAT T T GAAGT AGT T GT T GAGAAT GAT G G C C TT 
ATTGGTAAAC AAAAAGGT GT AAACAT CCCTT AT ACTAAAAT T CCT TT CCC 
AG C AC T T G C AG AACG CGAT AAT GCT GAT AT C C GT T T T GG AC T T GAG C AAG 
G AC T T AAC T T TAT T GCT AT C T CAT T T G T ACG T AC T G C T AAAgAT GT T AAT 
GAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACACGTTAAGTTGTT 
TGCT AAAATT GAAAATC AACAAGGT AT CGAT AAT ATT GAT GAG AT T AT CG 
AAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTATCGAAGTT 
CC AT T T G AAAT GG T T C C AGT T T AC C AAAAAAT GAT CAT T AC T AAAGT T AA 
T G C AG C T G GT AAAG C AGT TAT T AC AG C AAGAAAT AT GCT TGAAAC AAT G A 
CTGATAAACCACGTGCGACTCGTTCAGAAGTATCTGATGTCTTCAATGCT 
GTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCAGCTAATGG
T AAAT AC C C AGT T G AGT C AGT T C GT AC AAT G G CT AC T AT T G AT AAAAAT G 
CTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTGCATTCCCA 
CGT AAT AACAAAACT G AT GT T ATTGCAT CTGCGGT T AAAG AT GCAAC AC A 
CT C AAT G GAT AT C AAACT T GT T GT AAC AAT TACT GAAAC AGGT AAT AC AG 
CTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGGCTGTTACA 
T T T GAT G AAAAAGT AC AAC GT T CAT T GAT GAT T AAC T GGGGT G T T AT CCC 
TGTCCTTG C AGAC AAAC C AG CAT C TAG AG AT G AT AT GT T T G AGGT T G C AG 
AACGTGTAGCACTTGAAGCAGGATTTGTTGAATCAGGCGATAATATCGTT 
AT CGT T G C AG GTGTTCCT GT AG G T AC AG GT G G AAC T AAC AC AAT G C G TG T 
TCGTACTGTTAAA 

SEQ ID NO. 7204 

STRAIN H36B 

AATAAACGCGTAAAAAT CGT T GCAAC 

ACTTGGTCCTGCGGTTGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GATACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAGAAAAAATTGCT 
CAATTGATTAAAGAAGGTGCTAACGTTTTCCGTTTCAACTTCTCACATGG 
AG AT CAT GCT GAG C AAG G AGCT CG TAT GGC TAG T G T T C GT AAAG C AG AAG 
AG AT T GC AG GAC AAAAAGT T G G CT T C CT C C T T G AT ACT AAAGG AC CT G AA 
ATT CGT AC AG AAC T T T T T G AAG AT GGT GCAGAT TT CCAT T CAT AT AC AAC 
AGGT AC AAAAT T AC GT G T T G CT AC T AAGCAAGGT AT C AAAT C AACT C C AG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GT T T GC AAAAG AT AAAG AC ACT C GT GAAT TT G AAG T AGTT GT T GAGAAT G 
AT GG C CT T AT T GGT AAAC AAAAAGGTGT AAAC AT CCCTT AT ACT AAAATT 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT 
TGAGCAAGGACTTAACTTTATTGCTATCTCATTTGTACGTACTGCTAAAG 
AT GT T AAT G AAG T T C GT G CT AT T T G T G AAG AAAC T G G C AAT GG AC AC GT T 
AAG T T GT T T G C T AAAAT T GAAAAT C AAC AAGG T AT C GAT AAT AT T GAT G A 
GAT TAT C G AAG C AG C AG AT G GT AT T AT GAT TGCT C GT GGT G AT AT G G GT A 
T CG AAGT T C CAT T T G AAAT G GT T C C AG T T T AC C AAAAAAT GAT CAT TACT 
AAAGT T AAT G CAG C T GGT AAAG C AGTT ATT AC AGCAAC AAAT ATGCTTGA 
AAC AAT G Ac T GAT AAAC C AC GT G C G AC T C GT T C AG AAGT AT C T GAT GT C T 
T C AAT GCT G T TAT T GAT G GT ACT GAT G CT AC AAT G C T T T CAG G T G AGT C A 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
T AAAAATGCT C AAACATT ACT C AAT GAGT AT GGT CGCT T AGACT CAT CT G 
CAT T C C C AC GT AAT AAC AAAACT G AT GT T AT T G CAT CTGCGGTT AAAGAT 
GCAACACACTCAATGGATATCAAACTTGTTGTAACAATTACTGaAACAGG
T AAT AC AG C T CGT G C CAT T T C T AAAT T C C GT C C AG AT G C AG AC AT T T T GG 
C T GT T AC AT T T GAT G AAAAAG T AC AAC GT T CAT T GAT G AT T AACT GG G GT 
GT T AT CCCTGTCCT T GC AG AC AAAC C AG CAT CT AC AG AT GAT AT GT T T GA 
G GT T G C AG AACGT GT AG C AC T T GAAG C AG GAT T T G T T GAAT C AGGCG AT A 
ATATCGTTATCGTTGCAGGTGTTCCTGTAgGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7205 

STRAIN 18RS21 

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTTGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GAT ACT GGGGT G AAAG C C T T G AC GT Ag AAG C T T C AG C AG AAAAAAT T G CT 
C AAT T GAT TAAAG AAGGT GC T AACGT TTTCCGTTT C AACT T CT C AC AT G G 
AG AT CAT G C T GAG C AAGG AG CT C GT AT GG C T AC T GT T C GT AAAG C AG AAG 
AG AT T G C AG G AC AAAAAGT TGGCTTCCTCCTT GAT AC T AAAGGAC C T G AA 
ATTCGTACAGAACTTTTTGAAGATGGTGCAGATTTCCATTCATATACAAC 
AG GT AC AAAAT TAG GT GT T G CT AC T AAG C AAG GT AT C AAAT C AAC T C C AG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAG T T GGT AAGC AAAT C C T T GT T GAT GAT GGT AAAC T AGGT C T TACT G T 
GT T T G C AAAAGAT AAAG AC ACT C GTG AAT T T G AAGT AG T T GT T GAG AAT G 
ATGGCCTTATTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
C CTT T C C C AG C ACT T G C AG AAC GC GAT AAT G CT G AT AT C C GT T T T GG AC T 
TGAGCAAGGACTTAACTTTATTGCTATCTCATTtGTACGTACTGCTAAAG 
AT GT T AAT G AAGT T C GT G C T AT T T GT G AAGAAAC T GG C AAT G G AC ACGT T 
AAGT T GTT TGCT AAAAT T GAAAAT C AAC AAGGT AT CG AT AAT AT T GAT G A 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T C G AAGT T C CAT T T G AAAT GGT T C C AGT T T AC C AAAAAAT GAT CAT TACT 
AAAGT T AAT G C AG C T GGT AAAG C AGT TAT T AC AG C AAC AAAT AT G C T T G A 
AACAATGaCTGATAAACCACGTGCGACTCGTTCAGAAGTATCTGATGTCT 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA 
G CT AAT G GT AAAT AC C C AGT T GAG T C AGT T C GT AC AAT GG C T AC TAT T GA 
T AAAAAT G C T C AAAC AT T AC T C AAT GAG TAT GGT C G C T TAG ACT CAT C T G 
CATTCCCACGTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGAT 
G C AAC AC AC T C AATG GAT AT C AAACT T GT T GT AAC AAT TAG T G AAAC AG G 
TAATACAGCTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGG 
CT GT T AC AT T T GAT G AAAAAG T AC AAC GT T CAT T GAT G AT T AAC T G G G GT 
GT T AT CCCTGTCCTTG C AG AC AAAC C AG CAT C T AC AG AT GAT AT GT T T G A 
GGTTGCAGAACGTGTAgCACTTGAAGCAGGATTTGTTGAATCAGGCGATA 
AT AT C GT T AT CGT T GC AGGT G T T C CT GT AgG T AC AG GT G G AAC T AAC AC A 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7206 

STRAIN M732 

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GATACT GGGGT GAAAGCCTTGACGTAGAAGCTTCAGCAGAAAAAAT TGCT 
CAATTGATTAAAGAAGGTGCT AACGT TTTCCGTTTCAACTTCTC AC AT GG 
AGAT CAT G CT G AG C AAG GAG C T CG TAT G G CT AC T GT T CG TAAAG C AG AAG 
AGATTGCAGGACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAA 
AT T C GT AC AG AAC T T T T T GAAG AT GGT G C AG AT T T C CAT T CAT AT AC AAC 
AGGTACAAAATTACGTGTTGCTACTAAGCAAGGTATCAAATCAACTCCAG 
AAGT GAT TGCATT GAAT GT TGCT GGT GGACTTGACATCTTT GAT GACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GTTTGCAAAAGATAAAGACACTCGTGAATTTGAAGTAGTTGTTGAGAATG 
AT GGCCTT ATT GGT AAAC AAAAAGGTGTAAACATCCCTT AT ACT AAAAT T 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT 
TGAGCAAGGACTTAACTTTATTGCTATCTCATTTGTACGTACTGCTAAAG 
AT G T T AAT GAAG T T C GT G C T AT T T GT GAAG AAACT GG C AAT GG AC ACGT T 
AAGT TGTTTGCT AAAAT T GAAAAT C AAC AAG GT AT C GAT AAT AT T GAT G A 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
TCGAAGTTCCATTTGAAATGGTTCCAGTTTACCAAAAAATGATCATTACT 
AAAGT TAATGCAGCT GGT AAAGC AGT T ATT ACAGC AAC AAAT AT GCTTGA 
AAC AAT G ACT GAT AAAC C AC GT GC G AC T CGT T C AG AAGT AT C T GAT GT C T 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
T AAAAATGCT CAAACATT ACT CAAT GAGT ATGGT CGCTT AGACT C AT CT G 
CATTCCCACGTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGAT 
GC AAC AC ACT CAAT GG AT AT C AAACT T GT T G T AAC AAT T AC T G AAAC AG G 
TAATACAGCTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGG 
CTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGATTAACTGGGGT 
GT T AT CCCTGTCCTTG C AG AC AAAC C AGC AT CT AC AG AT GAT AT GT T T G A 
GGTTGCAGAACGTGTAgCACTTGAAGCAGGACTTGTTGAATCAGGCGATA 
AT AT C GT T AT C G T T G C AGGT G TT C C T GT AG GT AC AG GT GG AACT AAC AC A 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7207 

STRAIN COH1 

AATAAACGCGTAAAAATCGTTGCAAC 

ACT TGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGT GAGT CTG 
GATACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAGAAAAAATTGCT 
CAAT T G AT T AAAG AAGGT G CT AAC GT T TTCCGTTT C AAC T T CT C AC AT G G 
AG AT CAT G C T G AG C AAGG AG CT C G T AT GGC T AC T GT T C GT AAAG C AG AAG 
AGATTGCAGGACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAA 
AT T C GT AC AG AAC t T T T T G AAG AT G GT G C AG AT T T C CAT T CAT AT AC AAC 
AGGTACAAAATTACGTGTTGCTACTAAGCAAGGTATCAAATCAACTCCAG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GT T T G C AAAAGAT AAAG AC AC T C GT G AAT T T GAAGT AG T T GT T GAG AAT G 
ATGGCCTTAtTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGgACT 
T G AG C AAGG AC T T AAC T T TAT T G C TAT C T CAT T T GT AC GT ACT G C T AAAG 
ATGTTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACACGTT 
AAGTTGTTTGCTAAAATTGAAAATCAACAAGGTATCGATAATATTGATGA 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T C GAAGT T C CAT T T GAAAT GGT T C C AGT T T AC C AAAAAAT GAT C AT T AC T 
AAAGT T AAT GC AG C T G GT AAAG C AGT TAT T AC AGC AAC AAAT AT G C T T G A 
AAC AAT GAC T G AT AAAC C AC GT G C G ACT C GT T C AG a AGT AT CT G AT G T C T 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTtTCAGGTGAGTCA 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
T AAAAATGCT CAAACATT ACT CAATGAGT ATGGT CGcTT AGACT CAT CTG 
CAT T C C C AC GT AAT AAC AAAACT GAT G T TAT T G CAT C T GC GGT T AAAG AT 
G C AAC AC AC T CAAT GGAT AT C AAACT T G T T GT AAC AAT TACT G AAAC AG G 
T AAT AC AGC T C GT GC CAT T T C T AAAT T C CGT C C AG AT G C AG AC AT T T T GG 
CTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGATTAACTGGGGT 
GT T AT CCCTGTCCTT GC AGAC AAAC C AGC AT C T AC AG AT GAT AT GT T T G A 
GGT T G C AGAAC GT GT AGC ACT T GAAG C AGG AC T T G T T G AAT C AG G C GAT A 
ATATCGTTATCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7208 

STRAIN M781

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GATACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAGAAAAAATTGCT 
CAATTGATTAAAGAAGGTGCTAACGTTTTCCGTTTCAACTTCTCACATGG 
AGATCATGCTGAGCAAGGAGCTCGTATGGCTACTGTTCGTAAAGCAGAAG 
AGATTGCAGGACAAAAAGTTGGCTTCCTCCTTGATACTAAAGGACCTGAA 
ATTCGTACAGAACTTTTTGAAGATGGTGCAGATTTCCATTCATATACAAC
AGGTACAAAATTACGTGTTGCTACTAAGCAAGGTATCAAATCAACTCCAG
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GTTTGCAAAAGATAAAGACACTCGTGAATTTGAAGTAGTTGTTGAGAATG
ATGGCCTTATTGgTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
CCTTTCCCAGCACTTGCAGaaCGCGATAATGCTGATATCCGTTTTGGACT 
TGAGCAAGGACTTAACTTTATTGCTATCTCATTTGTACGTACTGCTAAAG
ATGTTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACACGTT 
AAGTTGTTTGCTAAAATTGAAAATCAACAAGGTATCGATAATATTGATGA
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA
TCGAAGTTCCATTTGAAATGGTTCCAGTTTACCAAAAAATGATCATTACT
AAAGTTAATGCAGCTGGTAAAGCAGTTATTACAGCAACAAATATGCTTGA 
AACAATGACTGATAAACCACGTGCGACTCGTTCAGAAGTATCTGATGTCT 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
TAAAAATGCTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTG 
CATTCCCACGTAATAACAAAACTGATGTTATTGCATCTGCGGTTAAAGAT
GCAACACACTCAATGGATATCAAACTTGTTGTAACAATTACTGAAACAGG 
TAATACAGCTCGTGCCATTTCTAAGTTCCGTCCAGATGCAGACATTTTGG
CTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGATTAACTGGGGT 
GTTATCCCTGTCCTTGCAGACAAACCAGCATCTACAGATGATATGTTTGA
GGTTGCAGAACGTGTAGCACTTGAAGCAGGACTTGTTGAATCAGGCGATA 
ATATCGTTATCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7209 

STRAIN CJB110 

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTTGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GAT AC T G G GGT G AAAG C CT T G AC GT Ag AAG CT T C AG C AGAAAAAAT T G C T 
CAATTGATTAAAGAAGGTGCTAACGTTTTCCGTTTCAACTTCTCACATGG 
AGAT CATGCTGAGCAAGGAGCT CGT ATGGCTACTGTTCGTAAAGCAGAAG 
AGAT T G C AG GAC AAAAAGT T GGCTTCCTCCTT GAT AC T AAAGG AC CT G AA 
AT T C GT AC AG AAC T T T T T G AAG AT G GT G C AG AT T T C CAT T CAT AT AC AAC 
AGGTACAAAATTACGTGTTGCTACTAAGCAAGGTATCAAATCAACTCCAG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
G T TT G C AAAAG AT AAAG AC AC T C GT GAAT T T GAAGT AG T T G T T GAG AAT G 
ATGGCCTTAtTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT 
T G AAC AAG GAC T T AACT T T AT T G C T AT CT CAT T T GT ACGT ACT G C T AAAG 
ATGTTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACACGTT 
AAGT T GT T T G C T AAAAT T GAAAAT C AAC AAG GT AT C GAT AAT AT T GAT G A 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T C GAAGT T C CAT T T GAAAT G GT T C C AGT T T AC C AAAAAAT GAT CAT TACT 
AAAGTTAATGCAGCTGGTAAAGCAGTTATTACAGCAACAAATATGCTTGA 
AAC AAT GAC T GAT AAAC C AC GT GC G ACT CGT T C AG AAGT AT C T GAT GT CT 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
TAAAAATGCTCAAACATTACTCAATGAGTATGGTCGCTTAGACTCATCTG 
CATT C CCACGT AAT AAC AAAACTGAT GT T AT T GCAT CT GCGGTTAAAGAT 
G C AAC AC AC T C AAT G GAT AT C AAACT T GT T G T AAC AAT T AC T G AAAC AGG 
TAATACAGCTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGG 
C T G T T AC AT T TG AT G AAAAAGT AC AACGTT CAT T GAT G ATT AAC T G G G GT 
G T TAT CCCTGTCCTTG C AG AC AAAC C AG CAT C T AC AG AT GAT AT GT T T G A 
GGT T G C AG AAC GT GT AG C AC T T G AAG C AGG AT T T GT T GAAT C AG G C GAT A 
ATATCGtTATCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7210 

STRAIN 1169NT 

AATAAACGCGTAAAAATCGTTGCAAC 

ACTTGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GAT ACT GGGGTGAAAGCCTTGACGT AG AAGCTTC AG C AGAAAAAAT TGCT 
C AAT T GAT T AAAG AAG GT G CT AAC GT T TT C CG T T T C AAC T T CT C AC AT GG 
AGAT CATGCTGAGCAAGGAGCT CGT ATGGCTACTGTTCGTAAAGCAGAAG 
AGAT TGC AGG AC AAAAAGT TGGCTTCCTCCTT GAT ACT AAAGG ACCTGAA 
AT T C G T AC AG AAC T T T T T G AAG AT G GT G C AG AT T T C C AT T CAT AT AC AAC 
AGGTACAAAATTACGTGTTGCTACTAAGCAAGGTATCAAATCAACTCCAG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GT T T GC AAAAGAT AAAGAC ACT CGT GAATT TGAAGT AGT T GT T GAGAAT G 
ATGGCCTTATTGGTAAACAAAAAGGTGTAAACATCCCTTATACTAAAATT 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT
TGAGCAAGGACTTAACTTTATTGCTATCTCATTTGTACGTACTGCTAAAG 
AT GT T AAT G AAGT T C GT G C TAT T T GT G AAGAAAC T GG C AAT GG AC ACGT T 
AAG T T GT T T G c T AAAAT T G AAAAT C Aa C AAGG T AT CG AT AAT AT T GAT GA 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T C G AAGT T C CAT T T GAAAT G GT T C C AGT T T AC C AAAa AAT GAT CAT TACT 
Aa AGT T AAT G C AGCT GGT AAAG C AG T TAT TAG AG C AAC AAAT AT G C T T G A 
AAC AAT GAC T GAT AAAC C ACGT G C G ACT C GT T C AG AAGT AT CT G AT GT C T 
T C AAT G C T GT TAT T GAT GGT AC T GAT G C T AC AAT G C T T T C AGGT G AGT C A 
G C T AAT GGT AAAT AC C C AGT T G AGT C AGT T CG T AC AAT GG C TACT AT T G A 
TAAAAATGCTCAAACAttACTCAATGAGTATGGTCGTTTAGACTCATCTG 
CAT T C C C AC G T AAT AAC AAAAC T GAT GT TAT T G CAT C T G C GGT T AAAG AT 
G C AAC AC AC T C AAT G G AT AT C AAAC TT G T T G T AAC AAT TACT G AAAC AGG 
T AAT AC AG CT C GT GC C AT T T C T AAAT T C C GT C C AG AT G C AGAC AT T T T G G 
CTGTTACATTTGATGAAAAAGTACAACGTTCATTGATGATTAACTGGGGT 
G T T AT CCCTGTCCTTG C AG AC AAAC C AG CAT C T AC AG AT GAT AT G T T T G A 
G GT T G C AGAAC GT GT AG C ACT T GAAG C AGG ACT T GT T G AAT C AGG C GAT A 
AT AT CGT TAT C GT T G C AG GT G T T C C T G T AGGT AC AG GT GG AAC T AAC AC A 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7211 

STRAIN JM9130013 

AAT AAACGCGT AAAAAT C GT T G C AAC 

ACTTGGTCCTGCGGTAGAATTCCGTGGTGGTAAGAAGTTTGGTGAGTCTG 
GATACTGGGGTGAAAGCCTTGACGTAGAAGCTTCAGCAGAAAAAATTGCT 
CAATTGATTAAAGAAGGTGCTAACGTTTTCCGTTTCAACTTCTCACATGG 
AG AT C ATGC TG AGC AAGG AG CT C GT AT GG C T AC T G T T C G T AAAG C AG AAG 
AG AT T GC AGG AC AAAAAGT T GGC T TCCTCCTT GAT ACT AAAG GAC C T G AA 
AT T C GT AC AG AACT T T T T G AAGAT G GT T C AG AT T T C CAT T CAT AT AC AAC 
AG GT AC AAAAT T AC GT GT T G C TACT AAG C AAG GT AT C AAAT C AACT C C AG 
AAGTGATTGCATTGAATGTTGCTGGTGGACTTGACATCTTTGATGACGTT 
GAAGTTGGTAAGCAAATCCTTGTTGATGATGGTAAACTAGGTCTTACTGT 
GTT T GCAAAAGAT AAAGACACTCGT GAATTT GAAGT AGTT GTTGAGAATG 
AT GGCCTT ATT GGT AAAC AAAAAGGTGT AAAC AT CCCTT AT ACT AAAAT T 
CCTTTCCCAGCACTTGCAGAACGCGATAATGCTGATATCCGTTTTGGACT 
TGAGCAAGGACTTAACTTTATTGCTATCTCATTTGTACGTACTGCTAAAG 
ATGTTAATGAAGTTCGTGCTATTTGTGAAGAAACTGGCAATGGACATGTT 
AAGT T GT T T G CT AAAAT T G a AAAT C Aa C AAGG TAT C GAT AAT AT T GAT G A 
GATTATCGAAGCAGCAGATGGTATTATGATTGCTCGTGGTGATATGGGTA 
T C GAAGT T C CAT T T GAAAT GGT T C C AG T T T AC C AAAAAAT GAT CAT TAG T 
AAAGTTAATGCAGCTGGTAAAGCAGTTAttACAGCAACAAATATGCTTGA 
AAC AAT GAC T GAT AAAC C ACGT G CG AC T CGT T C AGAAGT AT CT G AT GT CT 
TCAATGCTGTTATTGATGGTACTGATGCTACAATGCTTTCAGGTGAGTCA 
GCTAATGGTAAATACCCAGTTGAGTCAGTTCGTACAATGGCTACTATTGA 
T AAAAAT G CT C AAAC AT TACT C AAT G AGT AT GGT CGCT T AG AC T CAT C T G 
CATTCCCACGTAATAaCAAAACTGATGTTATTGCATCTGCGGTTAAAGAT 
GCAACACACTCAATGGATATCAAACTTGTTGTGACAATTACTGAAACAGG 
TAATACAGCTCGTGCCATTTCTAAATTCCGTCCAGATGCAGACATTTTGG 
CT GTT AC AT T T GAT G AAAAAG T AC AAC GT T CAT T GAT GAT T AAC T G G G GT 
GT T AT CCCTGTCCTTG C AG AC AAAC C AG CAT C T AC AG AT G AT AT GT T T GA 
GGTTGCAGAACGTGTAgcACTTGAAGCAGGACTTGTTGAATCAGGCGATA 
ATATCGTTATCGTTGCAGGTGTTCCTGTAGGTACAGGTGGAACTAACACA 
ATGCGTGTTCGTACTGTTAAA 

SEQ ID NO. 7212 
STRAIN 2603 frame: 1 

MNKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHG 
DHAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQ 
GIKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLI 
GKQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGX 
GHVKLFAKIENQQGIDNIDEIIEAADGXMIARGDMGIEVPFEMVPVYQKMIITKVNAAGK 
AVITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATID 
KNAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFR
PDADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVI 
VAGVPVGTGGTNTMRVRTVK

SEQ ID NO. 7213 

STRAIN 090 frame: 1

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGSDFHSYTTGTELRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVWENDGLIG 
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK 
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLWTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV
AGVPVGTGGTNTMRVRTVK

SEQ ID NO. 7214 

STRAIN A909 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVWENDGLIG 
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG 
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVIV 
AGVPVGTGGTNTMRVRTVK

SEQ ID NO. 7215 

STRAIN H36B frame: 1

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG 
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG 
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK 
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLWTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7216 

STRAIN 18RS21 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD 
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEWVENDGLIG 
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK 
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVIV 
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7217 

STRAIN M732 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEWVENDGLIG 
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7218 

STRAIN COH1 frame: 1 
NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVVVENDGLIG 
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG 
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA 
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLWTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV 
AGVPVGTGGTNTMRVRTVK

SEQ ID NO. 7219 

STRAIN M781 frame: 1

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVWENDGLIG 
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLWTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7220 

STRAIN CJB110 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVWENDGLIG 
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK 
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLVVTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGFVESGDNIVIV
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7221 

STRAIN 1169NT frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD 
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGADFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEVWENDGLIG 
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK 
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLWTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV 
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7222 

STRAIN JM9130013 frame: 1 

NKRVKIVATLGPAVEFRGGKKFGESGYWGESLDVEASAEKIAQLIKEGANVFRFNFSHGD 
HAEQGARMATVRKAEEIAGQKVGFLLDTKGPEIRTELFEDGSDFHSYTTGTKLRVATKQG 
IKSTPEVIALNVAGGLDIFDDVEVGKQILVDDGKLGLTVFAKDKDTREFEWVENDGLIG
KQKGVNIPYTKIPFPALAERDNADIRFGLEQGLNFIAISFVRTAKDVNEVRAICEETGNG
HVKLFAKIENQQGIDNIDEIIEAADGIMIARGDMGIEVPFEMVPVYQKMIITKVNAAGKA
VITATNMLETMTDKPRATRSEVSDVFNAVIDGTDATMLSGESANGKYPVESVRTMATIDK 
NAQTLLNEYGRLDSSAFPRNNKTDVIASAVKDATHSMDIKLWTITETGNTARAISKFRP 
DADILAVTFDEKVQRSLMINWGVIPVLADKPASTDDMFEVAERVALEAGLVESGDNIVIV 
AGVPVGTGGTNTMRVRTVK 

SEQ ID NO. 7301 
STRAIN 2603 

TTGTCTGCTATAATAGACAAAAAGGTGGTGATATTTATGTATTTAGCATTAATCGGTGAT 
AT C AT T AAT T C AAAAC AG AT ACT T G AAC GT G AAACT T T C C AAC AG T C T T T T C AG C AACT A 
ATGACCGAACTATCTGATGTATATGGTGAAGAGCTGATTTCTCCATTCACTATTACAGCT 
GGTGATGAATT T CAAG CTT T AT T G AAAC C AT CAAAAAAGGT ATTT CAAAT T ATT GAC CAT 
ATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTACAGGAAACATTATA
ACATCCATCAATTCAAATGAAAGTATCGGTGCTGATGGTCCTGCCTACTGGCATGCTCGC 
T C AGCT AT T AAT CAT AT AC AT GAT AAAAAT GAT TAT G G AAC AG T T C AAGT AG CT AT T T G C 
C TT GAT GAT GAAGAC C AAAAC CT T G AAT T AAC ACT AAAT AGT CT CAT T T C AGCT G GT GAT 
T T TAT C AAGT C AAAAT GGACT AC AAAC CAT T T T C AAAT GCT T G AGC ACT T AAT AC T T C AA 
GATAATT AT C AAG AAC AAT T T C AAC AT CAAAAGTTAGC CCAACT GGAAAAT ATTGAACCT 
AG T G CG CT G ACT AAACGC CT T AAAG C AAG C GG T C T GAAGAT T T ACT T AAGAAC GAGAAC A 
CAGGCAGCCGATCTATTAGTTAAAAGTTGCACTCAAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7302 

STRAIN 090 

TCTGCTATAATAGACAAAAAGGTGGTGATATTTATGTATTT 
AG CAT T AAT C GGT GAT AT CAT T AAT T C AAAAC AG AT AC T T GAAC GT G AAA 
CT T T C C AAC AGT CT T T T C AG C AAc T AAT GAC CG AACT AT c T G AT GT AT AT 
GGTGAAGAGCTGATTTCTCCATTCACTATTACAGCTGGTGATGAATTTCA 
AGCTT T AT T G AAAC CAT CAAAAAAGGT ATT T C AAAT T ATT GAC CATATTC 
AACT AGCT CT AAAAC CTGTTAATGTAAGGTTCGGCCTCGGtACAGGAAAC 
ATTAT AAC AT CC AT CAATTT AAAT GAAAGT AT CGGT GCTGATGGT C CT GC 
CTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATAAAAATGATT 
ATGGAACAGTTCAAGTAGCTATTTGCCTTGATGATGAAGACCAAAACCTT 
GAAT T AAC ACT AAAT AGT C T CAT T T C AG C T GG T GAT T T T AT C AAGT C AAA 
AT GG AC T AC AAAC CAT T T T C AAAT G CT T GAG C ACT T AAT AC T T C AAG AT A 
AT TAT C AAG AAC AAT T T C AAC AT C AAAAGT T AG C C C AACT G G AAAAT AT T 
GAAC C T AGT G CG C T G AC T AAAC GC C T T AAAG C AAGC G GT C T GAAGAT T T A 
CT T AAG AAC GAG AAC AC AG G C AG C C GAT C TAT TAG T T AAAAG T T G C AC T C 
AAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7303 

STRAIN A909 

TCTGCTATAATAGACAAAAAGGTGGTGATATTTATGTAT
TTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGAACGTGA
AACTTTCCAACAGTCTTTTCAGCAACTAATGACCGAACTATCTGATGTAT
ATGGTGAAGAGCTGATTTCTCCATTCACTATTACAGCTGGTGATGAATTT
CAAGCTTTATTGAAACCATCAAAAAAGGTATTTCAAATTATTGACCATAT
TCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTACAGGAA
ACATTATAACATCCATCAATTCAAATGAAAGTATCGGTGCTGATGGTCCT
GCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATAAAAATGA
TTATGGAACAGTTCAAGTAGCTATTTGCCTTGATGATGAAGACCAAAACC
TTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTTATCAAGTCA
AAATGGACTACAAACCATTTTCAAATGCTTGAGCACTTAATACTTCAAGA
TAATTATCAAGAACAATTTCAACATCAAAAGTTAGCCCAACTGGAAAATA
TTGAACCTAGTGCGCTGACTAAACGCCTTAAAGCAAGCGGTCTGAAGATT
TACTTAAGAACGAGAACACAGGCAGCCGATCTATTAGTTAAAAGTTGCAC
TCAAACTAAAGGGGGAAGCTATGATTTC

SEQ ID NO. 7304 

STRAIN H36B

TCTGCTATAATAGACAAAAAGGTGGTGATATTT 

ATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGA 
ACGTGAAACTTTCCAACAGTCTTTTCAGCAACTAATGACCGAACTATCTG 
AT GT AT AT GGT G AAG AG CT G AT T T C T C CAT T C ACT AT T AC AG C T G GT GAT 
GAAT T T C AAG CT T T AT T G AAAC CAT C AAAAAAG GT AT T T CAAAT TAT T G A 
CCAT AT TCAACTAGCTCT AAAAC CTGTTAATGTAAGGTTCGGCCT CGGT A 
C AG GAAAC AT TAT AAC AT C CAT C AAT T CAAAT G AAAG T AT C G GT G CT GAT 
GGTCCTGCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATAA 
AAATGATTATGGAACAGTTCAAGTAGCTATTTGCCTTGATGATGAAGACC
AAAAC CT T G AAT T AAC AC T AAAT AGT C T CAT T T C AG C T GGT GAT T T T AT C 
AAGTCAAAATGGACTACAAACCATTTTCAAATGCTTGAGCACTTAATACT 
T C AAGAT AAT TAT C AAG AAC AAT T T C AAC AT C AAAAGTT AGCC C AACT GG 
AAAAT AT T G AAC CT AGT GC G CT G ACT AAAC G C CT T AAAG C AAG C G GT C T G 
AAG ATT T AC T T AAGAAC G AG AAC AC AGG C AG C C GAT C T AT T AG T T AAAAG 
TTGCACTC AAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7305 
STRAIN 18RS21

TCTGCTATAATAGACAAAAAGGTGGTGATATTT 

ATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGA 
ACGTGAAACTTTCCAACAGTCTTTTCAGCAACTAATGACCGAACTATCTG 
AT GT AT AT G GT GAAGAG C T GAT T T CT C CAT T C AC TAT T AC AG CT GGT GAT 
GAATTTCAAGCTTTATTGAAACCATCAAAAAAGGTATTTCAAATTATTGA 
CCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTA 
CAGGAAACATTATAACATCCATCAATTCAAATGAAAGTATCGGTGCTGAT 
GGTCCTGCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATAA 
AAAT GAT TAT GGAAC AGT T CAAGT AGCT ATT TGCCTTGATGATGAAGACC 
AAAACCTTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTTATC 
AAG T C AAAAT G G AC T AC AAAC CAT T T T C AAAT GC T T GAG C AC T T AAT AC T 
T C AAG AT AAT T AT C AAGAAC AAT T T C AAC AT C AAAAGT T AGC C C AACT G G 
AAAAT ATTGAACCT AGT GCGCTG ACT AAACGCCTTAAAGCAAGC GGT CTG 
AAG AT T T ACT T AAG AAC GAG AAC AC AG G C AG C CG AT C T AT T AGT T AAAAG 
TTGCACTCAAACTAAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7306 

STRAIN M732 

T CT G CT AT AAT AGAC AAAAAGGT G G TG AT AT T 

TATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTG 
AAC GT G AAACT T T C C AAC AGT C T T T T C AG C AACT AAT G AC C G AAC TAT CT 
GATGTATATGGTGAAGAGCTGATTTCTCCATTCACTATTACAGCTGGTGA 
T G AAT T T C AAG C T T TAT T G AAAC a AT C AAAAAAG GT AT T T C AAAT TAT T G 
ACCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGT 
AC AGG AAAC AT TAT AAC AT C CAT C AAT T C AAAT G AAAGT AT C G GT G C T G A 
TGGTCCTGCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATA 
AAAAT GAT TAT G G AAC AG T T C AAG T AG CT AT T T G C C T T GAT GAT G AAG AC 
C AAAAC CTT G AAT T AAC AC T AAAT AGT C T CAT T T C AG CT GGT GAT T T TAT 
C AAGT C AAAAT GGACT AC AAAC C ATTTTC AAAT GCTTG AGC ACT T AAT AC 
T T C AAG AT AAT TAT C AAGAACAATTT CAACAT C AAAAGT T AGCC CAACTG 
G AAAAT AT T G AAC CT AGT GCGC T G ACT AAAC G C C T T AAAG C AAG C G G T C T 
G AAG AT T T ACT T AAGAAC GAG AAC AC AGG C AG C C G AT CT AT T AGT T AAAA 
GTTGCACTC AAACT AAAGGGGGAAGCTATG AT TTC 

SEQ ID NO. 7307 

STRAIN COH1 

T CT GCT AT AAT AGAC AAAAAGGT GGT GAT AT T 

T AT GT AT T T AGC AT T AAT CGGT GAT AT CAT T AAT T C AAAAC AG AT AC T T G 
AAC G T G AAAC T T T C C AAC AGT C T T T T C AG C AACT AAT G AC C G AACT AT CT 
GATGTATATGGTGAAGAGCTGATTTCTCCATTCACTATTACAGCTGGTGA 
T G AAT T T C AAG CT T T AT T G AAAC a AT C AAAAAAG GT AT T T C AAAT TAT T G 
ACCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGT 
AC AG G AAAC AT TAT AAC AT C CAT C AAT T C AAAT G AAAGT AT CGGT G CT GA 
TGGTCCTGCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATA 
AAAAT GAT T AT G G AAC AG T T CAAGT AG C T AT T T G C CT T GAT GAT GAAG AC 
C AAAAC CTT G AAT T AAC ACT AAAT AG T C T CAT T T C AG C T G G T GAT T T TAT 
CAAGT C AAAAT GG AC T AC AAAC CAT T T T C AAAT G C T T GAG C ACT T AAT AC 
T T C AAG AT AAT TAT C AAG AAC AAT T T CAACAT C AAAAG T T AG C C C AAC T G 
G AAAAT AT T G AAC C T AGT G C G C T G AC T AAAC G C CT T AAAG C AAG C GG T CT 
GAAG AT T T AC T T AAG AAC GAG AAC AC AGG C AG C C GAT C T AT T AGT T AAAA 
GT T G C ACT C AAAC T AAAG GGGG AAG C T AT GAT TTC 

SEQ ID NO. 7308 

STRAIN M781

TCTGCTATAATAGACAAAAAGGTGGTGATATTT 

ATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGA 
ACGT G AAAC T T T C C AAC AG T C T T T T C AG C AAC T AAT G AC C G AAC TAT CTG 
ATG T AT AT GGT GAAGAG C T GAT T T C T C C AT T C ACT ATT AC AG C T G G T GAT 
GAATTTCAAGCTTTATTGAAACAATCAAAAAAGGTATTTCAAATTATTGA 
CCAT ATT CAACT AGCT CT AAAAC CTGTT AAT GTAAGGTTCGGCCT CGGT A 
C AGG AAAC AT TAT AAC AT C CAT C AAT T C AAAT G AAAGT AT CGGT G CT G AT 
GGTCCTGCCTACTGGCATGCTCGCTCAGCTATTAATCATATACATGATAA 
AAAT GAT TAT GGAAC AGT T C AAG T AG C T AT T T G C C T T GAT GAT GAAG AC C 
AAAACCTTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTTATC
AAG T C AAAAT G GAC T AC AAAC CAT T T T C AAAT G C T T G AGC ACT T AAT AC T 
T CAAGAT AAT TAT CAAGAAC AAT T T CAACAT CAAAAGTTAGC CCAACT GG 
AAAATATTGAACCTAGTGCGCTGACTAAACGCCTTAAAGCAAGCGGTCTG 
AAG AT T T AC T T AAG AAC G AGAAC AC AGG C AG C CGAT CT AT T AGT T AAAAG 
T T G C ACT C AAACT AAAG GG G G AAG C T AT GAT T TC 

SEQ ID NO. 7309 

STRAIN CJB110 

T CT GCT AT AATAGAC AAAAAGGT GGT GGTA 

TTTATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACT 
T G AAC GT G AAACT T T C C AAC AGT CT T T T CAG C AAC T AAT GAC C G AAC TAT 
CTGATGTATATGGTGAAGAGCTGATTTCTCTATTCACTATTACAGCTGGT 
GATGAATTTCAAGCTTTATTGAAACCATCAAAAAAGGTATTTCAAATTAT 
TGACCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCG 
GT AC AGGAAACAT TAT AAC AT C CAT C AAT TC AAAT GAAAGT AT CGGTGC T 
GAT GGTCCTGC C T AC T GG C AT G C T C G C T C AGC TAT T AAT CAT AT AC AT G A 
T AAAAAT GATT ATGGAACAGTT CAAGT AGCTATT TGCCTT GATGAT GAAG 
ACCAAAACCTTGAATTAACACTAAATAGTCTCATTTCAGCTGGTGATTTT 
ATCAAGTCAAAATGGACTACTAACCATTTTCAAATGCTTGAGCACTTAAT 
ACT T CAAGAT AAT TAT CAAGAAC AAT T T CAACAT C AAAAGT T AGC C C AAC 
TGGAAAATATTGAACCTAGTGCGCTGACTAAACGCCTTAAAGCAAGCGGT 
CTGAAGATTTACTTAAGAACGAGAACACAGGCAGCCGATCTATTAGTTAA 
AAGT TGCACTC AAACT AAAGGGGGAAGCTATGATTTc 

SEQ ID NO. 7310 

STRAIN JM9130013 

T C T G C TAT AAT AG AC AAAAAGGT GGT GAT AT T T 

ATGTATTTAGCATTAATCGGTGATATCATTAATTCAAAACAGATACTTGA 
AC GT G AAAC T T T C C AAC AGT C T T T T CAG C AAC T AAT GAC C G AAC TAT C T G 
ATGTATATGGTGAAGAGCTGATTTCTCCATTCACTATTACAGCTGGTGAT 
GAAT TT CAAGCTTT ATTGAAACCAT CAAAAAAGGT ATTT CAAATT ATT GA 
CCATATTCAACTAGCTCTAAAACCTGTTAATGTAAGGTTCGGCCTCGGTA 
CAG G AAAC AT TAT AAC AT C C AT C AAT T C AAAT GAAAGT AT C G G T G CT G AT 
GGTCCTGC C T AC T G G CAT G C T CG CT CAG C T AT T AAT CAT AT AC AT G AT AA 
AAATGATTATGGAACAGTT CAAGTAG CT AT T TGC CTT GATGAT GAAGACC 
AAAACC T T GAAT T AAC ACT AAAT AGT C T CAT T T CAG CT G G T GAT T T TAT C 
AAGT C AAAAT GGACT AC AAAC CAT T T T C AAAT G C T T GAG C ACT T AAT AC T 
T CAAGAT AAT TATCAAGAACAAT TT CAACAT CAAAAGTTAGCCC AACT GG 
AAAATATTGAACCTAGTGCGCTGACTAAACGCCTTAAAGCAAGCGGTCTG 
AAG AT T T AC T T AAG AAC GAG AAC AC AG G CAG C C GAT C T AT T AGT T AAAAG 
TTGCACTC AAACT AAAGGGGGAAGCTATGATTTC 

SEQ ID NO. 7311 
STRAIN 2603 frame: 1 

LSAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITA 
GDEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHAR
SAINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQ 
DNYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7312 

STRAIN 090 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVTQIIDHIQLALKPVNVRFGLGTGNIITSINLNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7313 

STRAIN A909 frame: 1

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 
SEQ ID NO. 7314

STRAIN H36B frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS 
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7315 

STRAIN 18RS21 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS 
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7316 

STRAIN M732 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKQSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7317 

STRAIN COH1 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKQSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS 
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7318 

STRAIN M781 frame: 1

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKQSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS 
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7319 

STRAIN CJB110 frame: 1 

SAIIDKKVVVFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISLFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS 
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7320 

STRAIN JM9130013 frame: 1 

SAIIDKKVVIFMYLALIGDIINSKQILERETFQQSFQQLMTELSDVYGEELISPFTITAG
DEFQALLKPSKKVFQIIDHIQLALKPVNVRFGLGTGNIITSINSNESIGADGPAYWHARS 
AINHIHDKNDYGTVQVAICLDDEDQNLELTLNSLISAGDFIKSKWTTNHFQMLEHLILQD 
NYQEQFQHQKLAQLENIEPSALTKRLKASGLKIYLRTRTQAADLLVKSCTQTKGGSYDF 

SEQ ID NO. 7401 
STRAIN 2603 

AT GGAAATGC AAGT T C AAAAAAGT TTT AAAT CAAAT AT AC ATT AC GGAAC ACT CT AT 
C T AGT C C C AACT C C AAT T GGT AAT C TAG AT GAT AT G ACT T TT C GT G C CAT TAG GAT T T T A 
AG AG AAGT T GAT T T T AT T T GT G C AG AGG AT AC AC G AAAT ACGG G AC T T T T AC T C AAG C AC 
TTT GAT AT T AC TACT AAAC AAAT T AGT TTT C AC G AAC AC AAT G CT T AC G AT AAAAT C T CT 
G GG T T AAT T GAT T T GT TAAAAG AAG GG AAAT C T T T AG C C C AAG TAT C T GAT GC AGG AAT G 
CCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTGAAGGGGATATCCCA 
GTTGTATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTCATCGCTTCAGGTTTAGCT 
CCACAACCTCATATTTTTTATGGCTTCTTACCTCGTAAGAAAGGTCAACAAATAACTTTC 
TTT GAAACAAAGC AAG AT T AC C CTGAAACACAAAT CTT T T AT GAGT C ACCGTT T CGAGT C 
T CT GAT AC G CT AAAAC AC AT G AAAG AG AT T T AC G G AGAT C G C C AAGT T GT T TT AGT AC G C 
G AAT T G AC G AAAC T CT AT G AAG AG TAT C AAAG AGG AAC C AT T AGT C AACT T TT AG AG C AT 
AT T G AAAAGGT C C CT C T C AAAG GT G AAT G CT T AAT TAT T G T T GAT G G T AAG AG AG AT AC C 
GAG C GAG T G AAAG AC AG T AGC C AAC AAG AT C C AC T AGT AT T AGT AAAAG AAT AT AT C G CT 
AATGGTGATAAAACTAATCAAGCGATAAAAAAAGTAGCAAAAGAATTTAATCTCAATAGA
C AAG AAC T C T AT G CT AGT T T C C AT GAT TT A 

SEQ ID NO. 7402 
STRAIN 090

GAAATGCAAGTTCAAAAAAGTTTTAAATCAAATACACATTACGGGACACT 
CT AT C T AGT C C C AAC T C C AAT T GG T AAT CT AGAT GAT AT GAG TTTTCGTG 
C CAT TAG GAT T T T AAGAGAAGT T GAT T T TAT T T GT G C AG AGG AT AC ACG A 
AATACGGGACTTTTACTCAAGCACTTTGATATTACTACTAAACAAATTAG 
TTTTCACGAACACAATGCTTACGATAAAATCTCTGGGTTAATTGATTTGT 
T AAAAGAAGG G AG AT C T T T AGC C C AAGT AT C T GAT G C AG G AAT G C C C T CT 
AT T T C T GAC C C AG g AC AT G AC C T T GT C AAGG C T G CT AT T G AAG GGGG GAT 
CCCGGTCGTATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTCATCG 
CTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACCGCGT 
AAG AAAGGT C AAC AAAT AACT T T T T T T G AAAC AAAGAAAGAT T AC C C T G a 
AACACAAATCTTTTATGAGTCACCGtTTCGAGTCTcTGATACGCTAAAAC 
ACATGAAAGAGATTTACGGAGATCGCCAAGTTGTTTTAGTACGCGAATTG 
AC G AAa C T C T AT G AAG AGT AT C AAAG AGG AAC CAT T AGT C AAC T T T TAG G 
G CAT AT T G AAAAAGT C C CT CT C AAAGG T G AAT G CT T AAT T AT T GT T GAT G 
GTAAGAGAGATACCGAGCGAGTGAAAGACAGTAGCCAACAAGATCCACTA 
GT AT T AGT AA 

SEQ ID NO. 7403 

STRAIN A909 

AGT T C AAAAAAG T T T T AAAT C AAAT AT AC AT T AC G G AAC ACT CT AT C T AG 
TCCCAACTCCAATTGGTAATCTAGATGATATGACTTTTCGTGCCATTAGG 
AT T T T AAG AG AAGT T GAT T T T AT T T GT G C AG AG GAT AC AC G AAAT AC G G G 
AC T T T T AC T C AAGC ACT T T GAT AT TAG T AC T AAAC AAAT T AGT T T T C AC G 
AACACAATGCTTACGATAAAATCTCTGGGTTAATTGATTTGTTAAAAGAA 
GGGAAATCTTTAGCCCAAGTATCTGATGCAGGAATGCCCTCTATTTCTGA 
CCCAGGACATGACCTTGTCAAGGCTGCTATTGAAGGGGATATCCCAGTTG 
TATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTCATCGCTTCAGGT 
TTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACCACGTAAGAAAGG 
T C AAC AAAT AACT T T C T T T g AAAC AAAG C AAG AT T AC C CT G AAAC AC AAA 
TCTTTTATGAGTCACCGTTTCGAGTCTCtGATACGCTAAAACACATGAAA 
GAG AT T T ACGG AG AT C G C C AAGT T G T T T T AGT ACG C G AAT T G ACG AAACT 
CT AT GAAG AGT AT C AAAGAG GAAC CAT T AGT C AAC T T T TAG AG CAT AT T G 
AAAAGGTCCCTCTCAAAGGTGAATGCTTAATTATTGTTGATGGTAAGAGA 
GAT AC CG AG C GAG T G AAAG AC AG TAG C C AAC AAG AT C C AC TAG TAT TAG T 
AA 

SEQ ID NO. 7404 

STRAIN H36B

GAAAT GC AAGT T C AAAAAAGTT T T AAAT C AAAT AC AC AT T 
AC GG GAC ACT C TAT C T AGT C C C AAC T C C AAT T GG T AAT C TAG AT GAT AT G 
ACTTTTCGTGCCATTAGGATTTTAAGAgAAGTTGATTTTATTTGTGCAGA 
GGATACACGAAATACGGGACTTTTACTCAAGCACTTTGATATTACTACTA 
AAC AAAT T AGT T T T C AC GAAC AC AAT G C T T AT GAT AAAAT C T CT GG GT T A 
AT T GAT T T G T T AAAAG AAG G GAG AT C T T TAG C C C AAGT AT C T G AT GC AGG 
AATGCCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTG 
AAGGGGATATCCCGGTCGTATCTATACCAGGAGCTAGCGCTGGTATTACT 
GCTCTCATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTT 
C T T AC C G CG T AAG C AAG GT C AAC AAAT AAC T T T T T T T G AAAC AAAG AAAG 
ATTACCCTGAAACACAAATCTTTTATGAGTCACCGtTTCGAGTCTCTGAT 
ACG CT AAAAC AC AT G AAAG AG AT T T AT G GAG AT C GC CAAGT T G T T T T AGT 
AC G C G AAT T G ACG AAAC T C TAT GAAG AGT AT C AAAG AGG AAC CAT T AGT C 
AACT TTTAGGGCAT ATT GAAAAGGTCCCTCTCAAAGGTGAATGCTTAATT 
AT T G T T GAT GGT AAGAG AG AT AC T GAG C G AGT G AAAG AC AGT AG C C AAC A 
AGAT CCACT AGT ATT AGT AA 

SEQ ID NO. 7405 

STRAIN 18RS21 

GAAATGC AAGT T CAAAAAAGT TT T AAAT C AAAT AT ACAT T 
ACGGAACACTCTATCTAGTCCCAACTCCAATTGGTAATCTAgATGATATG 
ACTTTtCGTGCCATTAGGATTTTAAGAGAAGTTGATTTTATTTGTGCAGA
Gg AT AC AC G AAAT AC G G G AC TT T T ACT C AAG C AC T T T GAT AT T AC TACT A 
AACAAATTAGTTTTCACGAACACAATGCTTACGATAAAATCTCTGGGTTA 
AT T GAT T T G T T AAAAGAAGGG AAAT C T T T AG C C C AAG TAT CT G AT G C AGG 
AATGCCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTG 
AAG GGG AT AT C CC AGT T GT AT C TAT AC C AGG AG C T AG C G C T G GT AT TAG T 
GCTCTCATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTT 
C T T AC C ACGT AAGAAAGGT C AAC AAAT AAC T T T C t T T G AAAC AAAGC AAG 
AT T AC CC T G AAAC AC AAAT CT T T TAT G AGT C AC CG t T T C G AGT C T C T GAT 
ACG CT AAAAC AC AT G AAAG AG AT T T AC G GAG AT CG C C AAGT T G T T T T AGT 
ACGCGAATTGACGAAACT CTAT GAAGAGTAT CAAAGAGGAAC C ATT AGT C 
AACTTTTAGAGCATATTGAAAAGGTCCCTCTCAAAGGTGAATGCTTAATT 
AT T GT T G AT GGT AAG AG AG AT AC C GAG C GAGTG AAAG AC AGT AGC C AAC A 
AGATCCACTAGTATTAGTAA 

SEQ ID NO. 7406 

STRAIN M732 

G AAAT GC AAGT T C AAAAAAGT T T T AAAT C AAAT 

AT AC AT T AC G GAAC ACT C T AT CT AGT C C C AACT C C AAT T GGT AAT CT AG A 
T GAT AT G AC TTTTCGTGC C AT T AGGAT T T T AAG AG AAGT T GAT T T T AT T T 
GTGCAGAGGATACACGAAATACGGGACTTTTACTCAAGCACTTTGATATT 
ACTACT AAACAAAT T AGTT T T C ACG AACAC AAT GCTT ACGAT AAAAT CTC 
TGGGTTAATTGATTTGTT AAAAGAAGGG AAATCTTTAGCCCAAGTATCTG 
ATGCAGGAATGCCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCT 
G CT AT T GAAGGGG AT AT C C C AGT T G TAT CT AT AC C AG G AG CT AG C G C T G G 
TATTACTGCTCTCATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTT 
ATGG CT T CT T AC C AC GT AAGAAAGG T C AAC AAAT AAC T T T C T T T GAAAC A 
AAGCAAGATTACCCTGAAACACAAATCTTTTATGAGTCACCGtTTCGAGT 
CT C T GAT AC G C T AAAAC AC AT GAAAG AG AT T T AC G GAG AT C G C C AAGT T G 
TTTTAGTACGCGAATTGACGAAACTCTATGAAGAGTATCAAAGAGGAACC 
AT T AGTCAACT T T T AGAGC AT ATT GAAAAGGT C CCT CT C AAAGGTGAATG 
CTTAATTATTGTTGATGGTAAGAGAGATACCGAGCGAGTGAAAGACAGTA 
GCCAACAAGATCCACTAGTATTAGTAA 

SEQ ID NO. 7407 

STRAIN COH1 

GAAATGCAAGTTCAAAAAAGTTTTaAATCAAATATACATTAC
GGAACACTCTATCTAGTCCCAACTCCAATTGGTAATCTAGATGATATGAC
TTTTCGTGCCATTAGGATTTTAAGAGAAGTTGATTTTATTTGTGCAGAGG
ATACACGAAATACGGGAcTTTTACTCAAGCACTTTGATATTACTACTAAA
CAAATTAGTTTTCACGAACACAATGCTTACGATAAAATCTCTGGGTTAAT
TGATTTGTTAAAAGAAGGGAAATCTTTAGCCCAAGTATCTGATGCAGGAA
TGCCCTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTGAA
GGGGATATCCCAGTTGTATCTATACCAGGAGCTAGCGCTGGTATTACTGC
TCTCATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCT
TACCACGTAAGAAAGGTCAACAAATAACTTTCTTTGAAACAAAGCAAGAT
TACCCTGAAACACAAATCTTTTATGAGTCACCGtTTCGAGTCTCTGATAC
GCTAAAACACATGAAAGAGATTTACGGAGATCGCCAAGTTGTTTTAGTAC
GCGAATTGACGAAACTCTATGAAGAGTATCAAAGAGGAACCATTAGTCAA
CTTTTAGAGCATATTGAAAAGGTCCCTCTCAAAGGTGAATGCTTAATTAT
TGTTGATGGTAAGAGAGATACCGAGCGAGTGAAAGACAGTAGCCAACAAG
ATCCACTAGTATTAGTAA

SEQ ID NO. 7408 

STRAIN M781

AAATGC AAGT T C AAAAAAGT T TT AAAT C AAAT AT AC AT TACGGAACACT C 
TATCTAGTCCCAACTCCAATTGGTAATCTAGATGATATGACTTTTCGTGC 
CAT TAG GAT T T T AAG AG AAG T T GAT T T TAT T T GT G C AG AGG AT AC AC GAA 
ATACGGgACTTTTACTCAAGCACTTTGATATTACTACTAAACAAATTAGT 
TTTCACGAACACAATGCTTACGATAAAATCTCTGGGTTAATTGATTTGTT 
AAAAGAAG GG AAAT C T T TAG C C C AAGT AT C T GAT G C AG GAAT G C C C T c T A 
TTTCTGACCCAGGACATGACCTTGTCAAGGCTGCTATTGAAGGGGATATC 
CCAGTTGTATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTCATCGC 
TTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACCACGTA 
AGAAAGGTCAACAAATAACTTTCTTTGAAACAAAGCAAGATTACCCTGAA
ACACAAAT CT TTT AT GAGT CACCGTTT CGAGT cT cTGAT ACGCTAAAAC A 
CAT G AAAG AG AT T T AC GGAGAT CGC C AAGT T GT T T TAG TAG G CGAAT T G A 
CG AAAC T C TAT G AAG AGT AT C AAAGAG GAAC CAT T AGT C AAC T T T T AGAG 
C AT ATT G AAAAGGT C C C T C T C AAAGGT G AAT G CT T AAT TAT T GT T GAT G G 
T AAGAGAGAT AC C GAG C G AGT GAAAG AC AGT AGC C AAC AAGAT C C AC TAG 
TATTAGTAA 
A 

SEQ ID NO. 7409 

STRAIN CJB110 

G AAAT GC AAGT T C AAAAAAGT T T T AAAT C AAAT AC AC AT T AC GGG AC AC 
T CT AT CT AGT CCCAACT C CAATTGGT AATCTAGATGAT AT GACTTTT CGT 
GC CAT T AGG AT T T T AAG AG AAGT T GAT T T TAT T T GT G C AG AG G AT AC ACG 
AAAT AC G G G AC TTT TACT C AAG C AC T T T GAT AT T AC T AC T AAAC AAAT T A 
GTTTTCACGAACACAATGCTTACGATAAAATCTCTGGGTTAATTGATTTG 
T T AAAAG AAG GGAGAT C T T T AGC C C AAG TAT C T GAT G C AGG AAT GC C C T C 
TAT T T CT G AC C C AG G AC AT GAC C T T GT C AAG G C T GC TAT T G AAGGGG GG A 
T CC CGGT C GT AT C TAT AC C AGG AG C TAG C G CT G GT AT T AC T G C T C T CAT C 
GCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACCGCG 
T AAGAAAG GT C AAC AAAT AAC T T T t T T T GAAAC AAAG AAAGAT T AC C CT G 
AAACACAAATCTtTTATGAGTCACCGtTTcGAGTCTCTGATACGCTAAAA 
C AC AT GAAAG AG AT T T ACG GAG ATC GC C AAGT T G T T T T AGT AC G C G AAT T 
GACG AAAC T C TAT G AAGAG T AT C AAAG AG GAAC CAT T AGT C AACT T T T AG 
GGCATATTGAAAAAGTCCCTCTCAAAGGTGAATGCTTAATTATTGTTGAT 
GGT AAGAG AGAT AC C GAG C GAGT GAAAG AC AGT AGC C AAC AAG AT C C AC T 
AG TAT T AG T AA 

SEQ ID NO. 7410 

STRAIN 1169NT 

TGCAAGTT CAAAAAAGT T TT AAAT C AAAT ACAC AT T ATGGGACACT CT AT 
CT AGT CCCAACT CCAATTGGTAATCTAGAT GAT AT GACTTTTCGTGCC AT 
TAG GAT T T T AAG Ag AAGT T G a T T T T AT T T G T G C AG AGG AT AC AC G AAAT A 
C GG G ACT T T T AC T C AAG C AC TTT GAT a T T AC T AC T AAAC AAAT T AG t TTT 
c ACG AAC AC AAT GC T T AC G AT AAAAT CTCTGGGT T AAT T GAT T t GT T AAA 
AGAAGGGAAAT CTTT AGCC C AAGTATCT GATGCAGGAATGCC CT CT AT T T 
C T G ACC C AGG AC AT GAC C T T GT C AAG G C T G C TAT T GAAG GGG AT AT C C C A 
GTTGTATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTCATCGCTTC 
AGG T T TAG C T C C AC AAC C T CAT AT T T T T TAT G G C T T C T T AC C AC GT AAG A 
AAGGT C AAC AAAT AACT T T T T T T GAAAC AAAG C AAGAT TAT C C T GAAAC A 
CAAAT CTT TT ATGAGT CACCGtTTCG AG TCT CTGAT ACGCTAAAAC AC AT 
GAAAG AG AT T T AC GGAGAT CGC C AAGT T GT T T T AG T AC G C G AAT T GAC gA 
AAC T C T AT GAAG AG TAT C AAAG AGG AAC CAT T a G T C AACT T T TAG AG CAT 
ATTGAAAAGGTCCCTCTCAAAGGTGAATGCTTAATTATTGtTGATGGTAA 
GAG AG At a C C GAG C GAG T GAAAG AC AG TAG C C AAC AAG AT C C ACT AG TAT 
TAGTAA 

SEQ ID NO. 7411 

STRAIN JM9130013 

GAAATGCAAGTTCAAAAAAGTTTTAAATCAAATACACATTACGGGA
CACTCTATCTAGTCCCAACTCCAATTGGTAATCTAgATGATATGACTTTT
CGTGCCATTAGGATTTTAAGAGAAGTTGATTTTATTTGTGCAGAGGATAC
ACGAAATACGGGACTTTTACTCAAGCACTTTGATATTACTACTAAACAAA
TTAGTTTTCACGAACACAATGCTTATGATAAAATCTCTGGGTTAATTGAT
TTGTTAAAAGAAGGGAGATCTTTAGCCCAAGTATCTGATGCAGGAATGCC
CTCTATTTCTGACCCAGGACATGACCTTGTCAAGGCtGCTATTGAAGGGG
ATATCCCGGTCGTATCTATACCAGGAGCTAGCGCTGGTATTACTGCTCTC
ATCGCTTCAGGTTTAGCTCCACAACCTCATATTTTTTATGGCTTCTTACC
GCGTAAGCAAGGTCAACAAATAACtTTTTTTGAAACAAAGAAAGATTACC
CTGAAACACAAATCTTTTATGAGTCACCGTTTCGAGTCTCTGATACGCTA
AAACACATGAAAGAGATTTATGGAGATCGCCAAGTTGTTTTAGTACGCGA
ATTGACGAAACTCTATGAAGAGTATCAAaGAGGAACCATTAGTCAACTTT
TAGGGCATATTGaAAAGGTCCCTCTCAAAGGTGAATGCTTAATTATTGTT
GATGGTAAGAGAGATACTGAGCGAGTGAAAGACAGTAGCCAACAAGATCC
AGTAGTATTAGTAA

SEQ ID NO. 7412 
STRAIN 2603 frame: 1 

MEMQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHF 
DITTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPV
VSIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVS
DTLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTE
RVKDSSQQDPLVLVKEYIANGDKTNQAIKKVAKEFNLNRQELYASFHDL 

SEQ ID NO. 7413 

STRAIN 090 frame: 1 

EMQVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGRSLAQVSDAGMPSISDPGHDLVKAAIEGGIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKKDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLGHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7414 

STRAIN A909 frame: 2 

VQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFDITT 
KQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVVSIP
GASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSDTLK
HMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTERVKD 
SSQQDPLVLV 

SEQ ID NO. 7415 

STRAIN H36B frame: 1

EMQVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGRSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKQGQQITFFETKKDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLGHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7416 

STRAIN 18RS21 frame: 1 

EMQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7417 

STRAIN M732 frame: 1 

EMQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTER 
VKDSSQQDPLVLV 

SEQ ID NO. 7418 

STRAIN COH1 frame: 1 

EMQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7419 

STRAIN M781 frame: 3

MQVQKSFKSNIHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFDI
TTKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVVS
IPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSDT
LKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTERV
KDSSQQDPLVLV

SEQ ID NO. 7420 

STRAIN CJB110 frame: 1 

EMQVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD 
ITTKQISFHEHNAYDKISGLIDLLKEGRSLAQVSDAGMPSISDPGHDLVKAAIEGGIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKKDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLGHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPLVLV 

SEQ ID NO. 7421 

STRAIN 1169NT frame: 3 

QVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFDIT 
TKQISFHEHNAYDKISGLIDLLKEGKSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVVSI
PGASAGITALIASGLAPQPHIFYGFLPRKKGQQITFFETKQDYPETQIFYESPFRVSDTL
KHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLEHIEKVPLKGECLIIVDGKRDTERVK
DSSQQDPLVLV 

SEQ ID NO. 7422 

STRAIN JM9130013 frame: 1 

EMQVQKSFKSNTHYGTLYLVPTPIGNLDDMTFRAIRILREVDFICAEDTRNTGLLLKHFD
ITTKQISFHEHNAYDKISGLIDLLKEGRSLAQVSDAGMPSISDPGHDLVKAAIEGDIPVV
SIPGASAGITALIASGLAPQPHIFYGFLPRKQGQQITFFETKKDYPETQIFYESPFRVSD
TLKHMKEIYGDRQVVLVRELTKLYEEYQRGTISQLLGHIEKVPLKGECLIIVDGKRDTER
VKDSSQQDPVVLV

SEQ ID NO. 7501 
STRAIN 2603 

AT G AGCGT AT ATGTTAGT GG AAT AGGAATT AT T 

TCTTCTTTGGGAAAGAATTATAGCGAGCATAAACAGCATCTCTTCGACTTAAAAGAAGGA 
AT T T C T AAAC AT T TAT AT AAAAAT C ACG AC T CT AT T T TAG AAT C T TAT AC AG G AAG CAT A 
ACTAGTGACCCAGAGGTTCCTGAGCAATACAAAGATGAGACACGTAATTTTAAATTTGCT 
TTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTTAAAAGCTTATCATAAT 
AT TGCTGTGTGTT T AGGGAC C T C ACT T GG GGG AAAG AGT GC T GGT C AAAAT G C C T T GT AT 
CAATTTGAAGAAGGAGAGCGTCAAGTAGATGCTAGTTTATTAGAAAAAGCATCTGTTTAC 
CATATTGCTGATGAATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCA 
AC CGCCTGTTCTG C AAG T AAT AAT G CCGT AAT AT T AGG AAC AC AAT T AC T T C AAG AT G G C 
GAT T GT G AT T T AGCT AT TTGTGGTGGCTGT GAT G AGT T AAGT GAT AT T T C T T T AG C AG GC 
T T C AC AT C AC T AGG AGCT AT T AAT AC AGAAAT GG C AT GT C AG C C C T AT T C T T C T GG AAAA 
GGAATCAATTTGGGTGAGGGCGCTGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCT 
AAAT AT G G AAAAAT TAT CGGTGGTCT TAT T AC T T C AGAT G GT T AT CAT AT AAC AG C ACC T 
AAG C C AAC AG G T G AAGGG G CG GC AC AG ATT GC AAAG C AG C T AGT G AC T C AAG C AGGT AT T 
GACTACAGTGAGATTGACTATATTAACGGTCACGGTACAGGTACTCAAGCTAATGATAAA 
AT GG AAAAAAAT AT GT AT GG T AAGT T T T T C C C GAC AACG AC AT T GAT C AG C AGT AC C AAG 
G GG C AAAC G GGT CAT AC T C T AGG GG C T G C AG G TAT TAT C GAAT T GAT T AAT T GT T T AGC G 
GCAATAGAGGAACAGACTGTACCAGCAACTAAAAATGAGATTGGGATAGAAGGTTTTCCA 
G AAAAT T T T GT C T AT CAT CAAAAGAGAGAAT ACC C AAT AAG AAAT G C T T T AAAT T T T T CG 
TTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTTAGATTCACCTCTAGAA 
ACATTACCTGCTAGAGAAAATCTTAAAATGGCTATCTTATCATCTGTTGCTTCCATTTCT 
AAGAATGAATCACTTTCTATAACCTATGAAAAAGTTGCTAGTAATTTCAACGACTTTGAA 
G C AT T ACG C T T T AAAG G G G CT AGAC C AC C C AAAAC T G T C AAC C C AG C AC AAT T T AGG AAA 
ATGGATGATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAATAGAAAGCAAT 
AT T AAT C T AAAAAAAC AAG AT AC T T C AAAAGT AGG AAT T GT AT T T AC AAC AC T T T C T GG A 
C C AGT T G AGG T T G T T G AAGG TAT T G AAAAG C AAAT C AC AAC AG AAGG AT AT G C AC AT GT T 
TCTGCTTCACGATTCCCGTTTACAGTAATGAATGCAGCAGCTGGTATGCTTTCTATCATT 
T T T AAAAT AAC AG GT C CT T TAT C T GT CAT T T C GAC AAAT AGT G GAG C G CT T GAT GGT AT A 
C AAT AT GC C AAGG AAAT GAT G CGT AAC GAT AAT C TAG AC TAT G T GAT TCTTGTTTCTGCT 
AATCAGTGGACAGACATGAGTTTTATGTGGTGGCAACAATTAAACTATGATAGTCAAATG 
TTTGTCGGTTCTGATTATTGTTCAGCACAAGTCCTCTCTCGTCAAGCATTGGATAATTCT 
CCTATAATATTAGGTAGTAAACAATTAAAATATAGCCATAAAACATTCACAGATGTGATG 
ACT AT T T T T GAT GCTGCGCTT C AAAAT T TAT TAT C AGAC T TAG GAC T AAC CAT AAAAG AT 
AT C AAAG GT T T CGT T T GG AAT G AG CG G AAG AAGG C AGT T AGT T C AG AT TAT GAT T T CT T A 
GCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGGTCAGTTTGGATTTTCA 
T C T AAT G GT G C T GGT G AAG AAC T G GAC TAT AC T GT T AAT G AAAGT AT AG AAAAGG GC T AT 
TATTTAGTCCTATCTTATTCGATCTTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG

SEQ ID NO. 7502 

STRAIN 090 

AT G T T AGT G GAAT AGG AAT TAT TTCTTCTTT G GG AAAG a AT T AT 
AG C GAG C AT AAAC AG CAT C T C T T CGACT T AAAAG AAG G AAT T T C T AAAC A 
TTTATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAA 
CT AGT GAG C C AG AG G T T C CT GAG C AAT AC AAAG AT G AG AC ACGT AAT T T T 
AAATTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAA 
TTTAAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGG 
GAAAGAGTGCTGGTCAAAATGCCTTGTATCAATTTGAAGAAGGAGAGCGT 
C AAG TAG AT G C T AGT T TAT T AG AAAAAG CAT CT GT T T AC CAT AT T G CT G A 
T GAAT T GAT G G C T TAT CAT GAT AT T G T GG GAG CT T C GT AT GT TAT T T C AA 
CCGCCTGTTC T GC AAGT AAT AAT GC C GT AAT AT T AG G AAC AC AAT T AC T T 
CAAGATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAG 
T GAT AT T T CT T TAG C AGG C T T C AC AT C ACT AG GAG C T AT T AAT AC AG AAA 
TGGCATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGC 
GCTGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAA 
AATTATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTA 
AG C C AAC AG G T GAAG G GG CGG C AC AG AT T G C AAAG C AGCT AG T G ACT C AA 
G C AG GT AT T G AC T AC AGT GAG AT T G ACT AT AT T AAC GGT C AC GGT AC AGG 
T ACT CAAGCT AAT GAT AAAATGGAAAAAAATAT GT ATGGTAAGTTTT T C C 
C G AC AAC G AC AT T GAT C AG C AGT AC C AAGG G G C AAAC GG G T CAT ACT C T A 
GGGGCTGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAGGA 
AC AGACT GT ACCAGCAACT AAAAATGAGATT GGGAT AG AAGGTT T T CCAG 
AAAATTTTGTCTATCATCAAAAGAGAGAATACCCAATAAGAAATGCTTTA 
AATTTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTATCTTATTGTCATC 
T T T AGAT T C AC C T C TAG AAAC AT T AC C T G C T AG AG AAAAT CT T AAAAT G G 
CTATCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATA 
AC C TAT GAAAAAGT T G CT AGT AAT T T C AAC G AC T T T GAAG C ATT ACG C T T 
T AAAGG G G C T AGAC C AC C C AAAAC T GT C AAC C C AG C AC AAT T T AGG AAAA 
TGGATGATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAATA 
G AAAG C AAT AT T AAT C T AAAAAAAC AAG AT AC T T C AAAAGT AGG AAT T GT 
AT T T AC AAC AC T T T C T GG AC C AGT T GAGGT T G T T G AAGG TAT T G AAAAGC 
AAAT C AC AAC AG AAG GAT AT GC AC AT GTTTCTGCTT C AC GAT TCCCGTTT 
ACAGTAATGAATGCAGC AGCT GGT AT GCTTTCT AT CATTTTT AAAAT AAC 
AGGTCCTTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTATAC 
AAT AT G C C AAGG AAAT GAT G CGT AAC GAT AAT C TAG AC TAT GTG AT T CT T 
GT T T C T G CT AAT C AGT GG AC AG AC AT G AGT T T T AT GT G GT G G C AAC AAT T 
AAACTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAG 
TCCTCTCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAA 
C AAT T AAAAT AT AG C CAT AAAAC AT T C AC AG AT GT GAT G AC TAT T T T T G A 
TGCTGCGCT T C AAAAT T TAT TAT C AGACT TAG G AC T AAC CAT AAAAG AT A 
TCAAAGGTTT CGT TTGGAATGAGCGGAAGAAGGC AGT T AGT TC AGAT TAT 
GATTTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTC 
T GGT C AGT T T G GAT T T T CAT CT AAT G G T G CT GGT GAAG AAC T GG AC T AT A 
CTGTTAATGAAAGTATAGAAAAGGGCTATTATTTAGTCCTATCTTATTCG 
ATCTTTGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7503 

STRAIN A909

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATT
AT AGCGAGC AT AAAC AGCAT C T CT T CGACT TAAAAGAAGGAAT TT CT AAA
C AT TT AT ATAAAAAT CACGACT CT AT TTT AGAAT CTT AT AC AGGAAG CAT
AAC TAG T GAC C C AG AG GT T C C T GAG C AAT AC AAAG AT GAG AC AC GT AAT T
TTAAATTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTT
AATTTAAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGG
GGG AAAGAGT G CT GGT C AAAAT G C CT T GT AT C AAT T T GAAG AAGG AG AG C
GT C AAG TAG AT G C T AG T T TAT TAG AAAAAG CAT CT GT T T AC CAT AT T G CT
GATGAATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTC
AACCGCCTGTTCTGCAAGTAATAATGCCGTAATATTAGGAACACAATTAC
TTCAAGATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTA
AGT GAT AT T T C T T T AGC AG G CT T C AC AT C AC TAG GAG C TAT T AAT AC AG A
AATGGCATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGG
GCGCTGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGA
AAAAT T AT CGGTGGTC T T AT T AC T T C AG AT G G T TAT C AT AT AAC AG C AC C 
T AAG C C AAC AG GT G AAG GGG CG G C AC AG ATT G C AAAG C AG C T AGT GAC T C 
AAG C AGGT AT T G AC T AC AGT GAG AT T G AC TAT AT T AACGGT C AC GGT AC A 
GGTACT C AAGCTAAT GAT AAAAT GGAAAAAAAT ATGTAT GGTAAGT T T TT 
C C C G AC AACG AC AT T GAT C AG C AG T AC C AAG G GG C AAACG G GT CAT AC T C 
TAGGGGCTGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAG 
GAAC AG AC T GT AC C AG CAACT AAAAAT GAG AT T G GG AT AG AAGGT T T T C C 
AGAAAAT T T T GT CT AT CAT C AAAAG AG AG AAT AC C C AAT AAG AAAT G C T T 
TAAATTTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCA 
T CTT T AGAT T CACCT CT AGAAAC AT T AC CTGCT AGAGAAAAT CTT AAAAT 
GGCTATCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTA 
TAAC CT AT GAAAAAGTT GCTAGT AATTT CAACGACTT T GAAGCATT ACGC 
T T T AAAGGGG CT AG AC C AC C C AAAACT G T C AAC C C AG C AC AAT T T AG G AA 
AATGGATGATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAA 
TAGAAAGCAATATTAATCTAAAAAAACAAGATACTTCAAAAGTAGGAATT 
GT AT T T AC AAC ACT T T C T GG AC C AG T T GAG G T T G T T G AAGGT AT T G AAAA 
G C AAAT C AC AAC AG AAGG AT AT G C AC AT GTTTCTGCTT C AC GAT T CC C GT 
TTACAGTAATGAATGCAGCAGCTGGTATGCTTTCTATCATTTTTAAAATA 
ACAGGTCCTTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTAT 
AC AAT AT G C C AAG G AAAT GAT G CGT AAC GAT AAT C TAG AC TAT G T GAT T C 
TTGTTTCT G CT AAT C AG T GGAC AGAC AT GAG T T T TAT GT G GT G G C AAC AA 
TTAAACTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACA 
AGTCCTCTCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTA 
AAC AAT T AAAAT AT AG C C AT AAAAC AT T C AC AG AT GT G AT GAC TAT T T T T 
GAT GCTGCGCTT C AAAAT T TAT TAT C AG AC T TAGG AC TAAC CAT AAAAG A 
TATCAAAGGTTTCGTTTGGAATGAGCGGAAGAAGGCAGTTAGTTCAGATT 
ATGATTTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCT 
TCTGGTCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTA 
T ACT GT TAAT GAAAGT AT AGAAAAGGGCTAT TATT TAGT CCT AT CT T AT T 
CGATCTTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7504 

STRAIN H36B

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAGCGA 
G C AT AAAC AG C AT C T CT T C GAC T T AAAAG AAG G AAT T T C T AAAC AT T T AT 
AT AAAAAT C ACGAC T C TAT T T TAG AAT C T T AT AC AGG AAGC AT AAC TAGT 
GAC C C AGAGGT T C CT GAG C AAT AC AAAG AT G AG AC ACGT AAT T T T AAAT T 
TGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTTAA 
AAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAAAG 
AGTGCTGGTCAAAATGCCTTGTATCAATTTGAAGAAGGAGAGCGTCAAGT 
AGAT G C T AG T T TAT T AG AAAAAGC AT CT G T T T AC CAT AT T G CT G AT G AAT 
TGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCGCC 
T GT T CT G C AAGT AAT AAT G C C GT AAT AT TAG GAAC AC AAT T AC T T C AAG A 
TGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGATA 
TTTCTTTAGCAGGCTTCACATCACTAGGAGCTATTAATACAGAAATGGCA 
TGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCTGG 
TTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAATTA 
TCGGTGGTCT TAT T AC T T C AG ATG G T TAT CAT AT AAC AG C AC C T AAG C C A 
AC AG GT G AAGG G G C GG C AC AG ATT G C AAAG C AG C TAGT G ACT C AAG C AG G 
TAT T GAC T AC AGT GAG AT T GAC TAT AT TAAC GGT C AC GGT AC AG GT AC T C 
AAGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCGACA 
AC GAC AT T GAT C AG C AGT AC C AAG GGG C AAAC G GGT CAT AC T C TAG GG G C 
TGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAGGAACAGA 
C T G T AC C AG C AAC T AAAAAT GAG AT T G G GAT AG AAG G T T T T C C AG AAAAT 
T T T G T C T AT CAT C AAAAG AG AG AAT AC C C AAT AAG AAAT G C T T T AAAT T T 
TTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTTAG 
AT T C AC C T C TAG AAAC AT T AC C T G C TAG AG AAAAT C T T AAAAT G G C TAT C 
TTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACCTA 
TGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTAAAG 
G GG C T AG AC C AC C C AAAAC T G T C AAC C C AG C AC AAT T TAG G AAAAT GG AT 
GAT T T T T C C AAAAT G GT T GC C GT AAC AAC AG CT C AAG C AC TAAT AG AAAG 
C AAT AT T AAT CT AAAAAAACAAG AT ACT T CAAAAGT AGG AAT T GT ATTT A 
C AAC ACT TT CT GGAC CAGTT GAGGT T GTT GAAGGT AT TG AAAAG C AAAT C 
AC AAC AG AAG GAT AT G C AC AT GT T T C T G C T T C AC GAT T C C C GT T T AC AGT
AAT G AAT G C AG C AGC T GGT AT GC T T T C T AT CAT T T T T AAAAT AAC AG GT C 
C T T TAT C T GT CAT T T CG AC AAAT AGT G G AGC G CT T GAT G GT AT AC AAT AT 
G C C AAG GAAAT GAT G C G T AAC GAT AAT CT AGAC T AT GT G AT TCTTGTTTC 
TGCTAATCAGTGGACAGACATGAGTTTTATGTGGTGGCAACAATTAAACT 
AT GAT AGT C AAAT GT T T GT CG GT T C T GAT TAT T GT T C AGC AC AAGT C C T C 
TCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAACAATT 
AAAAT AT AG C C AT AAAAC AT T C AC AG AT GT GAT GACT AT T T T T GAT G C T G 
C GC T T C AAAAT T TAT TAT C AG AC T T AGG AC T AAC C AT AAAAG AT AT C AAA 
GGTTTCGTTTG GAAT GAG CG G AAGAAGG C AGT T AGT T C AGAT TAT GAT T T 
CTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGGTC 
AGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACTGTT 
AAT G AAAG T AT AG AAAAGGG C TAT TAT T T AG T C CT AT C T T AT T CG AT C T T 
CGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7505 

STRAIN 18RS21 

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAGC
GAGC AT AAAC AGC AT CT CTT CGACTTAAAAGAAGGAATTT CT AAAC ATTT 
AT AT AAAAAT C ACG ACT CT ATT T TAG AAT CT TAT AC AGG AAG C AT AACT A 
GTGACCCAGAGGT T CCTGAGCAAT ACAAAGAT GAGAC ACGT AAT TTTAAA 
TTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTT 
AAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAA 
AG AGT G CT G GT C AAAAT G C C T T GT AT C AAT T T G AAG AAGG AGAG C GT C AA 
G T AGAT G CT AGT T T AT T AGAAAAAG CAT C T G T T TAG CAT AT T G C T GAT G A 
ATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCG 
CCTGTTCT GC AAG T AAT AAT GC C GT AAT AT T AGG AAC AC AAT T AC T T C AA 
GATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGA 
TAT T T CT T TAG C AGGCT T C AC AT C AC T AGG AG C T AT T AAT AC AGAAAT GG 
CATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCT 
GGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAAT 
TAT CGGTGGTCT TAT TACT T C AG AT GGT TAT CAT AT AAC AG C AC C T AAG C 
C AAC AGGTG AAG G G G C G GC AC AGAT T G C AAAG C AG C T AGT G AC T C AAGC A 
GGTATTGACTACAGTGAGATTGACTATATTAACGGTCACGGTACAGGTAC 
TC AAGCTAATGAT AAAAT GGAAAAAAAT AT GT AT GGT AAGTT TT T C CCGA 
C AAC GAC AT T GAT C AG C AGT AC C AAGGGG C AAACG G GT CAT AC T CT AGG G 
GCTGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAGGAACA 
GAC T G T AC C AG C AAC T AAAAAT GAG AT T G GG AT AG AAG GTT T T C C AGAAA 
AT T T T GT CT AT CAT C AAAAGAG AG AAT AC C C AAT AAGAAAT G C T T T AAAT 
TTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTT 
AGATTCACCTCTAGAAACATTACCTGCTAGAGAAAATCTTAAAATGGCTA 
TCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACC 
TATGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTAA 
AGGG G C T AGAC C AC C C AAAAC T GT C AAC C C AG C AC AAT T T AGG AAAAT GG 
AT GAT T T T T C CAAAAT GGTTGCCGTAAC AAC AGCT C AAGCACT AAT AGAA 
AGC AAT ATT AAT CTAAAAAAACAAGATACTTC AAAAG TAGGAATTGT ATT 
TACAACACTTTCTGGACCAGTTGAGGTTGTTGAAGGTATTGAAAAGCAAA 
T C AC AAC AG AAGGAT AT G C AC AT GTTTCTGCTT C AC GAT T C C C G T T T AC A 
GTAATGAATGCAGCAGCTGGTATGCTTTCTATCATTTTTAAAATAACAGG 
T C C T T TAT C T G T CAT T T C GAC AAAT AGT G GAG C G C T T GAT GGT AT AC AAT 
AT G C C AAG GAAAT GAT G C GT AAC GAT AAT C TAG AC TAT GT GAT T CT T GT T 
T C T G C T AAT C AGT GG AC AG AC AT G AGT T T TAT GT GGT G G C AAC AAT T AAA 
CTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAGTCC 
TCTCTCGT C AAG CAT T GG AT AAT T C T C C TAT AAT AT T AGGT AGT AAAC AA 
T T AAAAT AT AG C CAT AAAAC AT T C AC AG AT G T GAT GAC TAT T T T T GAT G C 
T G C G C T T CAAAAT T TAT TAT C AG AC T T AGG AC T AAC CAT AAAAG AT AT C A 
AAG GT T T C GT T T G GAAT G AGC G G AAGAAG G C AGT T AGT T C AG AT TAT GAT 
TTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGG 
T C AGT T T G GAT T T T CAT C T AAT GGTGCTGGT G AAG AAC T GG ACT AT AC T G 
T T AAT G AAAGT AT AG AAAAGG G CT AT TAT T T AG T C C T AT CT T AT T C G AT C 
TTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7506 

STRAIN M732

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAG
C GAG C AT AAAC AG CAT CT CT T C G ACT T AAAAG AAGG AAT T T C T AAAC AT T 
T AT AT AAAAAT C ACGACTCT ATTTT AGAAT CTT AT ACAGGAAGCAT AACT 
AGT G AC C C AGAGGT T C CT GAG C AAT AC AAAGAT GAG AC AC GT AAT T T T AA 
ATTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATT 
TAAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGA 
AAG AGT G CT G GT C AAAAT G C CT TGT AT C AAT T T G AAGAAGGAG AG CGT C A 
AG TAG AT G CT AGT T T AT T AGAAAAAG CAT C T GT T T AC CAT AT T GC T GAT G 
AATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACC 
GCCTGTTCTGCAAGTAATAATGCCGTAATATTAGGAACACAATTACTTCA 
AGATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTG 
AT AT T T CT T T AGC AG G C T T C AC AT C AC TAG GAG CT AT T AAT AC AG AAAT G 
GCATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGC 
TGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAA 
TTATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAG 
C C AAC AGGT G AAGG G G C G GC AC AG AT T G C AAAG C AG C T AGT GAC T C AAG C 
AGGT AT T G ACT AC AGT GAG AT T G AC TAT AT T AAC GGT C ACG GT AC AGGT A 
CTCAAGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCG 
AC AACGAC AT T GAT C AGC AGT AC C AAG G G G C AAAC GGGT CAT AC T CT AG G 
GG CT G C AG GT AT TAT C GAAT T GAT T AAT T GT T TAG CGG C AAT AG AGG AAC 
AGACTGTACCAGCAACTAAAAATGAGATTGGGATAGAAGGTTTTCCAGAA 
AAT T T T GT C TAT CAT C AAAAGAG AG AAT AC C C AAT AAG AAAT GC T T T AAA 
TTTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTT 
TAG ATT CACCTCT AG AAAC ATT AC CTGCTAGAG AAAAT CTTAAAATGGCT 
ATCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAAC 
CTATGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTA 
AAGG GG C TAG AC C AC C C AAAAC TGT C AAC C C AG C AC AAT T T AGG AAAAT G 
GAT GAT T T T T C C AAAAT G GT T G C C G T AAC AAC AGC T C AAGC AC T AAT AG A 
AAGC AAT AT T AAT C T AAAAAAAC AAG AT AC T T C AAAAGT AGG AAT T G T AT 
TTACAACACTTTCTGGACCAGTTGAGGTTGTTGAAGGTATTGAAAAGCAA 
AT C AC AAC AG AAGG AT AT GC AC AT GTTTCTGCTT C ACG ATT C CC GT T T AC 
AGT AAT GAAT G C AG C AG CT GGT AT G C T T T C TAT CAT T T T T AAAAT AAC AG 
GTC CT T T AT CT GT CAT T T CG AC AAAT AGT GG AG C G C T T GAT GGT AT AC AA 
TAT G C C AAGG AAAT GAT G C GT AAC GAT AAT CT AG AC TAT GT GAT T CT T GT 
T T CT G C T AAT C AGT G GAC AG AC AT G AGT T T T AT GT G GT G G C AAC AAT T AA 
ACT AT GAT AGT C AAAT GTTTGTCGGTTCT GAT TAT T GT T C AG C AC AAGT C 
CTCTCTCGT C AAG CAT T G GAT AAT T C T C C TAT AAT AT TAG GT AGT AAAC A 
AT T AAAAT AT AGC CAT AAAAC AT T C AC AG AT GT GAT G AC T AT TT T T GAT G 
CTGCGCTT C AAAAT T TAT TAT C AG AC T T AGG ACT AAC CAT AAAAG AT AT C 
AAAG GT T T C G T T T G GAAT G AGC G G AAG AAGG C AGT T AGT T C AGAT TAT G A 
TTTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTG 
GTCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTAtaCT 
GT T AAT G AAAGT AT AG AAAAGG G C T AT TAT T T AGT C CT AT C T TAT T C GAT 
CTTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7507 

STRAIN COH1 

AT GT T AGT G GAAT AG GAAT TAT TTCTTCTT T GGG AAAG AAT TAT AGC
GAG CAT AAAC AGC AT CT CT T C GAC T T AAAAG AAG GAAT T T C T AAAC AT T T
ATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAACTA
G T GAC C C AG AGGT T C CT GAG C AAT AC AAAG AT GAG AC AC GT AAT T T T AAA
TTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTT
AAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAA
AGAGT G C T GGT C AAAAT G C C T T GT AT C AAT T T G AAG AAG GAG AG CGT C AA
GTAGATGCTAGTTTATTAGAAAAAGCATCTGTTTACCATATTGCTGATGA
ATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCG
CCTGTTCTGCAAGTAATAATGCCGTAATATTAGGAACACAATTACTTCAA
GATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGA
TAT T T C T T TAG C AG G CT T C AC AT C AC T AGG AG C TAT T AAT AC AG AAAT G G
CATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCT
GGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAAT
TATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAGC
C AAC AG GT G AAG G GG C G G C AC AG AT T G C AAAG C AG CT AGT G ACT C AAG C A
GGTATTGACTACAGTGAGATTGACTATATTAACGGTCACGGTACAGGTAC
T C AAG CT AAT GAT AAAAT G G AAAAAAAT AT G T AT G GT AAGT T T T T C C C GA
C AACGAC AT T GAT C AG C AGT AC C AAGGG G C AAAC GGGT CAT AC T CT AGG G 
G C T G C AG GT AT TAT C GAAT T GAT T AAT T GT T TAG C G G C AAT AGAGG AAC A 
GACTGTACCAGCAACTAAAAATGAGATTGGGATAGAAGGTTTTCCAGAAA 
ATT T T GT CT AT CAT CAAAAG AGAGAAT AC C C AAT AAG AAAT GC T T T AAAT 
TTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTT 
AG AT T C ACCT CT AG AAAC AT T AC CT GC TAGAG AAAAT CT T AAAAT G G CT A 
TCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACC 
TAT GAAAAAGT T GCT AGT AAT T T C AAC GACT T T GAAG CAT T AC G C T T T AA 
AG G G G CT AGACC ACC C AAAAC T GT C AAC C C AG C AC AAT T TAG G AAAAT GG 
AT GAT T T T T C C AAAAT G GT T G C C GT AAC AAC AGC T C AAG C ACT AAT AG AA 
AG C AAT AT T AAT CT AAAAAAAC AAGAT AC T T C AAAAGT AG GAAT T GT AT T 
T AC AAC AC T T T CT GG AC C AGT T GAG GT T GT T G AAGGT AT T G AAAAG C AAA 
T C AC AAC AGAAGGAT AT G C AC AT GT T T C T G C T T C AC GAT T C C C GT T T AC A 
G T AAT GAAT G C AG C AG C T GGT AT GCT T T CT AT CAT T T T T AAAAT AAC AGG 
TCCTTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTATACAAT 
ATGCCAAGGAAATGATGCGTAACGATAATCTAGACTATGTGATTCTTGTT 
T C T G C T AAT C AGT GG AC AGAC AT G AGT T T TAT GT GGT G G C AAC AAT T AAA 
CTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAGTCC 
TCTCTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAACAA 
T T AAAAT AT AG C CAT AAAAC AT T C AC AGAT GT G AT G AC TAT T T T T GAT GC 
T G CG CT T C AAAAT T TAT TAT C AGACT T AGG AC T AAC CAT AAAAG AT AT C A 
AAGGT TTCGTTTG GAAT GAG C G GAAG AAG G C AG T T AGT T C AG AT TAT GAT 
TTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGG 
TCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACTG 
T T AAT GAAAGT AT AG AAAAG GG C TAT TAT T T AGT C C T AT CT T AT T C GAT C 
TTCGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7508 

STRAIN M781 

AT G T T AGT GG AAT AG GAAT TAT TTCTTCTTT G GG AAAG AAT TAT AG C 
GAG CAT AAAC AG CAT C T C T T C G AC T T AAAAG AAGG AAT T T C T AAAC AT T T 
ATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAACTA 
G T G AC C C AG AG GT T C C T GAG C AAT AC AAAG AT GAG AC ACGT AAT T T T AAA 
TTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTT 
AAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAA 
AG AGT GC T GGT C AAAAT GC C T T GT AT C AAT T T GAAG AAG GAG AGC GT C AA 
G TAG AT G CT AGT T TAT T AG AAAAAG CAT C T GT T T ACC AT AT T G C T GAT G A 
ATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCG 
CCTGTTCTG CAAGT AAT AAT G C C G T AAT AT T AGG AAC AC AAT T ACT T C AA 
GATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGA 
TATTTCTTTAGCAGGCTTCACATCACTAGGAGCTATTAATACAGAAATGG 
CAT GTCAGCCCTATTCTTCTGGAAAAGGAATCAATTT GGGT GAGGGCGCT 
GGT TTTGTTGTTCTTGTCAAAGATCAGTCCT TAG CT AAAT ATGGAAAAAT 
TATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAGC 
C AAC AGGT GAAGGGGC G G C AC AG AT T G C AAAG C AGC T AGT G AC T C AAG C A 
GG T AT T GAC T AC AGT GAG AT T G AC TAT AT T AAT GGT C ACGGT AC AGGT AC 
TCAAGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCGA 
C AACGAC AT T GAT C AG C AGT AC C AAG GG G C AAAC GGGT CAT AC T CT AGG G 
GC T G C AGGT AT TAT C GAAT T GAT T AAT T GT T T AG CGG C AAT AG AGG AAC A 
GAC T GT AC C AG C AAC T AAAAAT GAG AT T GG GAT AG AAGGT T T T C C AG AAA 
AT T T T GT C T AT CAT CAAAAG AG AG AAT AC C C AAT AAG AAAT GCT T T AAAT 
TTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTATCTTATTGTCATCTTT 
AGAT T C AC C T C TAG AAAC AT T AC C T G C T AG AGAAAAT C T T AAAAT G G CT A 
TCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACC 
TATGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTAA 
AGGGGCTAGACCACCCAAAACTGTCAACCCAGCACAATTTAGGAAAATGG 
AT GAT T T T T C C AAAAT GGTTGCCG T AAC AAC AG C T C AAG C ACT AAT AG AA 
AG C AAT ATT AAT CT AAAAAAAC AAGAT ACT T C AAAAGT AGG AATTGT ATT 
TACAACACTTTCTGGACCAGTTGAGGTTGTTGAAGGTATTGAAAAGCAAA 
T C AC AAC AG AAGG AT AT G C AC AT GTTTCTGCTT C AC GAT T C C C GT T T AC A 
GTAATGAATGCAGCAGCTGGTATGCTTTCTATCATTTTTAAAATAACAGG 
TCCTTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTATACAAT 
AT GC C AAGG AAAT GAT G C GT AAC GAT AAT C TAG AC T AT G T GAT T C T T GT T 
TCTGCTAATCAGTGGACAGACATGAGTTTTATGTGGTGGCAACAATTAAA
CTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAGTCC 
T CT CTCGT C AAGCATTGG ATAATT CT C CT AT AAT AT T AGGT AGT AAACAA 
T T AAAAT ATAGCC AT AAAACATT CACAGATGT GAT GACTATTTT T GATGC 
TGCGCT T CAAAAT T TAT TATC AGACTTAGGACT AACC ATAAAAGATAT CA 
AAGG T TTCGTTTG G AAT GAG C GGAAGAAGGC AGT T AGT T GAG AT TAT GAT 
TTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGG 
TCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACTG 
TTAATGAAAGTATAGAAAAGGGCTATTATTTAGTCCTATCTTATTCGATC 
TTTGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7509 

STRAIN CJB110 

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAGC 
GAGC ATAAACAGCAT CT CT T CGACT T AAAAGAAGG AATTT CT AAACAT T T 
ATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAACTA 
GT G AC C C AGAG GT T C C T G AGC AAT AC AAAG AT GAG AC AC G T AAT T T T AAA 
TTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTT 
AAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAA 
AGAGTGCTGGT CAAAAT GCCTTGT AT CAATTTGAAGAAGGAGAGCGTCAA 
GTAGATGCTAGTTTATTAGAAAAAGCATCTGTTTACCATATTGCTGATGA 
ATTGATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCG 
CCTGTTCTGCAAGTAATAATGCCGTAATATTAGGAACACAATTACTTCAA 
GATGGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGA 
TAT T T C T T T AGC AGG C T T C AC AT C AC T AGG AG C T AT T AAT AC AG AAAT G G 
CATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCT 
GGTTTTGTTGTTCTTGTCAAAGATCAGTCCTT AGC T AAAT ATGGAAAAAT 
TATCGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAGC 
CAACAGGTGAAGGGGCGGCACAGATTGCAAAGCAGCTAGTGACTCAAGCA 
GGT AT T G AC T AC AGT GAG AT T G ACT AT AT T AAT G GT C AC G GT AC AG G T AC 
T C AAG C T AAT GAT AAAAT G G AAAAAAAT AT G T AT GGT AAGT T T T T C C C GA 
C AAC G AC AT T GAT C AGC AGT AC C AAG G GGC AAAC G G GT CAT AC T C TAG G G 
G C T G C AGGT AT TAT C G AAT T GAT T AAT T GT T T AG C G G C AAT AGAG G AAC A 
G AC T GT AC C AG C AAC T AAAAAT GAG AT T GGG AT AG AAG GT T T T C C AG AAA 
ATT TT GT CT AT CAT C AAAAG AGAG AAT AC CCAAT AAG AAAT GCTT T AAAT 
TTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTATCTTATTGTCATCTTT 
AGATTCACCTCTAGAAACATTACCTGCTAGAGAAAATCTTAAAATGGCTA 
TCTTATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACC 
TAT G AAAAAG T T G C T AGT AAT T T C AACGAC T T T GAAGC AT T AC G CT T T AA 
AGGGGCTAGACCACCCAAAACTGTCAACCCAGCACAATTTAGGAAAATGG 
ATGATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAATAGAA 
AG C AAT AT T AAT C T AAAAAAAC AAGAT AC T T C AAAAGT AGG AAT T GT AT T 
TACAACACTTTCTGGACCAGTTGAGGTTGTTGAAGGTATTGAAAAGCAAA 
T C AC AAC AG AAGG AT AT GC AC AT GT T T C T G CT T C AC GAT T C CC G T T T AC A 
GTAATGAATGCAGCAGCTGGTATGCTTTCTATCATTTTTAAAATAACAGG 
T C CT T T AT CT GT CAT T T CG AC AAAT AG T G GAG C G CT T GAT GGT AT AC AAT 
AT G C C AAGG AAAT GAT G CG T AAC GAT AAT CT AG AC TAT GT G AT T C T T GT T 
T C T G CT AAT C AGT GG AC AG AC AT G AGT T T T AT GT GGT GG C AAC AAT T AAA 
C T AT GAT AGT C AAAT GTTTGTCGGTTCT GAT TAT T GT T C AG C AC AAGT C C 
T CT CTCGTCAAGCATTGGAT AAT TCTCCT AT AAT ATT AGGT AGT AAACAA 
T T AAAAT AT AG C C AT AAAAC AT T C AC AG AT GT G AT G ACT AT T T T T GAT G C 
T G CG C T T CAAAAT T TAT TAT C AGAC T T AG G ACT AAC CAT AAAAG AT AT C A 
AAGGT TTCGTTTG G AAT GAG C G G AAGAAG G C AGT T AGT T C AG AT TAT GAT 
TTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGG 
T C AGT T T G GAT T T T CAT C T AAT GGT G C T G GT G AAG AAC T GG AC TAT ACT G 
TTAATGAAAGTATAGAAAAGGGCTATTATTTAGTCCTATCTTATTCGATC 
TTTGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7510 

STRAIN 1169NT 

AT GT T AGT GG AAT AGG AAT TAT TTCTTCTTTGG G AAAG AAT T AT AG 
CGAGCAT AAACAGCAT CT CTT CGACT T AAAAG AAGG AATT T CT AAAC ATT 
TATATAAAAATCACGACTCTATTTTAGAATCTTATACAGGAAGCATAACT 
AGTGACCCAGAGGTTCCTGAGCAATACAAAGATGAGACACGTAATTTTAA 
ATTTGCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATT
TAAAAGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGA 
AAGAGTGCTGGTCAAAATGCCTTGTATCAATTTGAAGAAGGAGAGCGTCA 
AGT AG AT G C T AGT T TAT T AG AAAAAG CAT CT GT T T AC CAT AT T G C T GAT G 
AAT TG AT GGC T TAT CAT GAT AT T GT G G GAG C T T CGT AT GT TAT T T C AAC C 
G C CT GT T CT GC AAGT AAT AAT G C C GT AAT AT T AGG AAC AC AAT T ACT T C A 
AG AT GG C GAT T GT GAT T T AGC T AT T TGT G GT GG CT GT GAT G AGT T AAGT G 
ATATTTCTTTAGCAGGCTTCACATCACTAGGAGCTATTAATACAGAAATG 
GCATGTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGC 
TGGTTTTGTTGTTCTTGTCAAAGATCAGTCCTTAGCTAAATATGGAAAAA 
T TAT CGGTGGTCT T AT T AC T T C AG AT G GT T AT CAT AT AAC AG C AC CT AAG 
CCAACAGGTGAAGGGGCGGCACAGATTGCAAAGCAGCTAGTGACTCAAGC 
AGG TAT T GAC TAG AG T GAG AT T G AC TAT AT T AACG GT C AC GGT AC AGGT A 
CTCAAGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCG 
AC AAC GAC AT T GAT C AG C AGT ACC AAGGG G C AAAC G G G T CAT AC T C T AGG 
GGCTGCAGGTATTATCGAATTGATTAATTGTTTAGCGGCAATAGAGGAAC 
AGACT GT AC C AG C AAC T AAAAAT GAGATT GGGAT AGAAGGTTT T CC AGAA 
AAT T T T GT CT AT CAT C AAAAG AG AGAAT AC C C AAT AAG AAAT G CT T T AAA 
TTTTTCGTTTGCTTTTGGTGGAAATAATAGTGGTATCTTATTGTCATCTT 
TAGATTCACCTCTAGAAACATTACCTGCTAGAGAAAATCTTAAAATGGCT 
AT C T T AT CAT CTGTTGCTTC CAT T T C T AAG AAT G AAT C ACTT T CT AT AAC 
CTATGAAAAAGTTGCTAGTAATTTCAACGACTTTGAAGCATTACGCTTTA 
AAGG G G C TAG AC C AC C C AAAAC T GT C AAC C C AG C AC AAT T T AGG AAAAT G 
GAT GAT T T T T C C AAAAT GGT T G C C GT AAC AAC AG C T C AAG C ACT AAT AGA 
AAGCAATATTAAT CT AAAAAAACAAGAT ACTT CAAAAGTAGGAATT GTAT 
T T AC AAC ACT T T CT GGAC C AGT T GAGGT T GT TG AAGGT AT T GAAAAGC AA 
AT C AC AAC AGAAGG AT AT G C AC AT GTTTCTGCTT C AC GAT T C C CGT T T AC 
AG T AAT GAAT GC AG C AG C T G G T AT G C T T T C TAT CAT T T T T AAAAT AAC AG 
GT C C T T TAT C T G T CAT T T C GAC AAAT AGT GG AG C G CT T GAT G G T AT AC AA 
TAT G C C AAGG AAAT GAT GC GT AAC GAT AAT CT AG AC TAT GT GAT T C T T GT 
TTCTGCTAATCAGTGGACAGACATGAGTTTTATGTGGTGGCAACAATTAA 
ACTATGATAGTCAAATGTTTGTCGGTTCTGATTATTGTTCAGCACAAGTC 
CTCTCTCGT C AAGC AT T G GAT AAT T C T CC TAT AAT AT T AG GT AG T AAAC A 
AT T AAAAT AT AG C CAT AAAAC AT T C AC AGAT G T GAT GAC TAT T T T T GAT G 
CTGCGCTT C AAAAT T TAT TAT C AG AC T T AG G ACT AAC CAT AAAAG AT AT C 
AAAG GTTTCGTTTG GAAT GAG C GG AAG AAGG C AGT TAG T T C AGAT TAT G A 
TTTCTTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTG 
GTCAGTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACT 
GTTAATGAAAGTATAGAAAAGGGCTATTATTTAGTCCTATCTTATTCGAT 
CTTTGGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7511 

STRAIN JM9130013 

ATGTTAGTGGAATAGGAATTATTTCTTCTTTGGGAAAGAATTATAGCGAG 
CATAAACAGCATCTCTTCGACTTAAAAGAAGGAATTTCTAAACATTTATA 
T AAAAAT CACGACTCTATTTTAGAATCTT AT ACAGGAAGCATAACTAGTG 
ACCCAGAGGTTCCTGAGCAATACAAAGATGAGACACGTAATTTTAAATTT
GCTTTTACCGCTTTTGAAGAGGCTCTTGCTTCTTCAGGTGTTAATTTAAA 
AGCTTATCATAATATTGCTGTGTGTTTAGGGACCTCACTTGGGGGAAAGA 
GTGCTGGT CAAAATGCCTTGT AT C AATTT GAAGAAGGAGAGCGT C AAGT A 
GAT G C TAG T T TAT TAG AAAAAG CAT C T GT T T AC CAT AT T G CT GAT GAAT T 
GATGGCTTATCATGATATTGTGGGAGCTTCGTATGTTATTTCAACCGCCT 
GTT CT GCAAGT AATAAT GC CGT AAT ATT AGGAACAC AATT ACT T CAAGAT 
GGCGATTGTGATTTAGCTATTTGTGGTGGCTGTGATGAGTTAAGTGATAT 
T T CT T T AG C AGG CT T C AC AT C AC T AGG AG CT AT T AAT AC AG AAAT G G CAT 
GTCAGCCCTATTCTTCTGGAAAAGGAATCAATTTGGGTGAGGGCGCTGGT 
TTTGTTGTTCTTGTCAAAGATCAGTC CTT AG CT AAAT ATGG AAAAAT TAT 
CGGTGGTCTTATTACTTCAGATGGTTATCATATAACAGCACCTAAGCCAA 
CAGGTGAAGGGGCGGCACAGATTGCAAAGCAGCTAGTGACTCAAGCAGGT 
ATTGACTACAGTGAGATTGACTATATTAACGGTCACGGTACAGGTACTCA 
AGCTAATGATAAAATGGAAAAAAATATGTATGGTAAGTTTTTCCCGACAA 
CG AC AT T GAT C AGC AGT AC C AAG G G G C AAAC GG G T CAT AC T C TAG G GG C T 
G C AGGT AT TAT C GAAT T G AT T AAT TGT T T AG CG G C AAT AGAG G AAC AG AC 
T GT AC C AG C AAC T AAAAAT GAG AT T G GG AT AG AAG GT T T T C C AG AAAAT T 
T T GT CT AT CAT C AAAAG AG AG AAT AC C C AAT AAG AAAT GC T T T AAAT TT T
TCGTTTGCTTTTGGTGGAAATAATAGTGGTGTCTTATTGTCATCTTTAGA 
TTCACCTCTAGAAACATTACCTGCTAGAGAAAATCTTAAAATGGCTATCT 
TATCATCTGTTGCTTCCATTTCTAAGAATGAATCACTTTCTATAACCTAT 
G AAAAAG T T G C T AGT AAT T T C AAC GAC T T T GAAG CAT T AC GCT T T AAAG G 
GG CT AGACCAC C C AAAAC T G T C AAC C C AG C AC AAT T T AGG AAAAT G GAT G 
ATTTTTCCAAAATGGTTGCCGTAACAACAGCTCAAGCACTAATAGAAAGC 
AATATTAATCTAAAAAAACAAGATACTTCAAAAGTAGGAATTGTATTTAC 
AAC AC T T T C T G G ACC AGT T G AG GT T GT T GAAG G T AT T G AAAAG C AAAT C A 
C AAC AG AAG GAT AT G C AC AT GTTTCTGCTT C AC GAT T C C CGT T TAG AGT A 
ATGAATGCAGCAGCTGGTATGCTTTCT AT CAT TTTT AAAAT AAC AGGTCC 
TTTATCTGTCATTTCGACAAATAGTGGAGCGCTTGATGGTATACAATATG 
C C AAGG AAAT GAT GCGT AAC GAT AAT CT AG AC TAT GT GAT T CT T GT T T CT 
GCT AAT C AGT G GAC AG AC AT G AGT T T T AT GT G G T GGC AAC AAT T AAAC T A 
T GAT AGT C AAAT GTTTGTCGGTTCT GAT TAT T GT T C AG C AC AAGT C CT CT 
CTCGTCAAGCATTGGATAATTCTCCTATAATATTAGGTAGTAAACAATTA 
AAAT AT AG C CAT AAAAC AT T C AC AG AT GT GAT GAC TAT TTTT GAT G CT G C 
GCTTCAAAATTTATTATCAGACTTAGGACTAACCATAAAAGATATCAAAG 
GT T T C GT T T GG AAT GAGC G G AAGAAGG C AGT T AGT T C AG AT TAT GAT T T C 
TTAGCGAACTTGTCTGAGTATTATAATATGCCAAACCTTGCTTCTGGTCA 
GTTTGGATTTTCATCTAATGGTGCTGGTGAAGAACTGGACTATACTGTTA 
AT G AAAGT AT AG AAAAG GG C TAT TAT T T AGT C C T AT C T TAT T C GAT CT T C 
GGTGGTATCTCTTTTGCTATTATTGAAAAAAGG 

SEQ ID NO. 7512

STRAIN 2603 frame: 1

MSVYVSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQ 
YKDETRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQV 
DASLLEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGG
CDELSDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGL
ITSDGYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKF 
FPTTTLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKR 
EYPIRNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITY 
EKVASNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTS
KVGIVFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSV 
ISTNSGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSA 
QVLSRQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNER 
KKAVSSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIF
GGISFAIIEKR 

SEQ ID NO. 7513 

STRAIN 090 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGILLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA 
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN 
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7514 

STRAIN A909 frame: 3

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD 
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS 
FAIIEKR 

SEQ ID NO. 7515 

STRAIN H36B frame: 3

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL 
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA 
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7516 

STRAIN 18RS21 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL 
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7517 

STRAIN M732 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL 
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7518 

STRAIN COH1 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEVVEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS 
FAIIEKR 

SEQ ID NO. 7519 

STRAIN M781 frame: 3

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL 
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD 
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGILLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEWEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN 
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7520 

STRAIN CJB110 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE 
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL 
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGILLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEWEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7521 

STRAIN 1169NT frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGILLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEWEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV 
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS 
FAIIEKR 

SEQ ID NO. 7522 

STRAIN JM9130013 frame: 3 

VSGIGIISSLGKNYSEHKQHLFDLKEGISKHLYKNHDSILESYTGSITSDPEVPEQYKDE
TRNFKFAFTAFEEALASSGVNLKAYHNIAVCLGTSLGGKSAGQNALYQFEEGERQVDASL 
LEKASVYHIADELMAYHDIVGASYVISTACSASNNAVILGTQLLQDGDCDLAICGGCDEL
SDISLAGFTSLGAINTEMACQPYSSGKGINLGEGAGFVVLVKDQSLAKYGKIIGGLITSD 
GYHITAPKPTGEGAAQIAKQLVTQAGIDYSEIDYINGHGTGTQANDKMEKNMYGKFFPTT 
TLISSTKGQTGHTLGAAGIIELINCLAAIEEQTVPATKNEIGIEGFPENFVYHQKREYPI 
RNALNFSFAFGGNNSGVLLSSLDSPLETLPARENLKMAILSSVASISKNESLSITYEKVA
SNFNDFEALRFKGARPPKTVNPAQFRKMDDFSKMVAVTTAQALIESNINLKKQDTSKVGI 
VFTTLSGPVEWEGIEKQITTEGYAHVSASRFPFTVMNAAAGMLSIIFKITGPLSVISTN 
SGALDGIQYAKEMMRNDNLDYVILVSANQWTDMSFMWWQQLNYDSQMFVGSDYCSAQVLS 
RQALDNSPIILGSKQLKYSHKTFTDVMTIFDAALQNLLSDLGLTIKDIKGFVWNERKKAV
SSDYDFLANLSEYYNMPNLASGQFGFSSNGAGEELDYTVNESIEKGYYLVLSYSIFGGIS
FAIIEKR 

SEQ ID NO. 7601 
STRAIN 2603 

ATGAAAAAAGTCATCGATTTAAAAAAACTACAAAAAGCATATGCCTCAGAAACCGTTTTA
AATAATATTAATTTGGAGGTGTTTAAAGGCGAAATAATTGGATTAATAGGACCCTCTGGA
GCAGGGAAATCTACCTTGATTAAAACTATGCTTGGCATGGAAAAAGCAGATAAGGGAACA 
GCTCTTGTTCTTGATACTCAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATG 
GCTCAATCTGATGCCTTATACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCTTTGGA 
AAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAACTCATATTTCTAAAGTA
GTAGATCTAGAAAACCAACTTGATAAATTTGTCTCAGGTTACTCAGGAGGTATGAAAAGA
CGGCTTTCTCTAGCCATCGCCCTACTTGGAAACCCCACAGTTTTAATCCTAGATGAACCT 
ACCGTTGGAATTGATCCATCCTTGAGGAGAAAAATCTGGCAAGAGCTAATTAATATTAAG
GATGAAGGACATTCTATCTTTATTACAACCCACGTTATGGATGAAGCAGAATTAACAAGT
AAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACTCCATTACATTTAAAA 
AAACAATTTAATGTGAGTACTATTGAGGAAGTTTTCTTAAAAGCTGAAGGAGAA

SEQ ID NO. 7602 

STRAIN 090 

ATTTAAAAAAACTACAAAAAGCATATGCCTCAGAAACTGTTTTAAATAAT
ATTAATTTGGAGGTGTTTAAAGGCGAAATAATTGGATTAATAGGACCCTC
TGGAGCAGGGAAATCTACCTTGATTAAAACTATGCTTGGCATGGAAAAAG
CAGATAAGGGAACAGCTCTTGTTCTTGATACTCAAATGCCAGATCGTAAT
ATTTTAAATCAAATTGGCTATATGGCTCAATCTGATGCCTTATACGAATC
TTTAACTGCCTTAGAAaATTTATTATTCTTTGGAAAAATGAAAGGTATTC
AAAAAACTGAATTAAAACAGCAGATAACTCATATTTcTAAAGTAGTAGAT
CTAGAAAACCAACTTGATAAATTTGTCTCAGGTTACTCAGGAGGTATGAA
AAGACGGCTTTCTCTAGCCATCGCCCTACTTGGAAACCCCACAGTTTTAA
TCCTAGATGAACCTACCGTTGGAATTGATCCATCCTTGAGGAGAAAAATC
TGGCAAGAGCTAATTAATATTAaGGATGAAGGACGTTCTATCTTTATTAC
AACCCACGTTATGGATGAAGCAGAATTAACAAGTAAGGTTGCACTACTAT
TACGTGGAAACATTATTGCCTTTGATACTCCATTACATTTAAAAAAACAA
TTTAATGTGAGTACTATtGAGGAAGTTTTCTTAAAAGCTGAAGGAGAA

SEQ ID NO. 7603

STRAIN A909

AAAAAAGTCATCGATTTAAAAAAACTACAAAAAGCATATGCCTCA
GAAACCGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGCGAAATAAT
TGGATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTAAAACTA
TGCTTGGCATGGAAAAAGCAGATAAGGGAACAGCTCTTGTTCTTGATACT
CAAATGCCAGATCATAATATTTTAAATCAAATTGGCTATATGGCTCAATC
TGATGCCTTATACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCTTTG
GAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAACTCAT
ATTTCTAAAGTAGTAGATCTAGAAAACCAACTTGATAAATTTGTCTCAGG
TTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCTACTTG
GAAACCCCACAGTTTTAATCCTAGATGAACCTACCGTTGGAATTGATCCA
TCCTTGAGGAGAAAAATCTGGCAAGAGCTAATTAATATTAAGGATGAAGG
ACGTTCTATCTTTATTACAACCCACGTTATGGATGAAGCAGAATTAACAA
GTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACTCCA
TTACATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAGTTTTCTT
AAAAGCTGAAGGAGAA

SEQ ID NO. 7604 

STRAIN H36B

AAAAAAGTCATTGATTTAAAAAAACTACAAAAAGCATATGCC
TCAGAAACCGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGCGAAAT
AATTGGATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTAAAA
CTATGCTTGGCATGGAAAAAGCAGATAAGGGAaCAGCTCTTGTTCTTGAT
ACTCAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATGGCTCA
ATCTGATGCCTTATACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCT
TTGGAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAACT
CATATTTCTAAAGTAGTAGATCTAGAAAACCAACTTGATAAATTTGTCTC
AGGTTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCTAC
TTGGAAACCCCACAGTTTTAATCCTAGATGAACCTACCGTTGGAATTGAT
CCATCCTTGAGGAGAAAAATCTGGCAAGAGCTAATTAATATTAAGGATGA
AGGACGTTCTATCTTTATTACAACCCACGTTATGGATGAAGCAGAATTAA
CAAGTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACT
CCATTACATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAGTTTT
CTTAAAAGCTGAAGGAGAA

SEQ ID NO. 7605

STRAIN 18RS21 

GATTTAAAAAAACTACAAAAAGCATATGCCTCAGAAACCGTTTTAAATAA
TATTAATTTGGAGGTGTTTAAAGGCGAAATAATTGGATTAATAGGACCCT
CTGGAGCAGGGAAATCTACcTTGATTAAAACTATGCTTGGCATGGAAAAA
GCAGATAAGGGAACAGCTCTTGTTCTTGATACTCAAATGCCAGATCGTAA
TATTTTAAATCAAATTGGCTATATGGCTCAATcTGATGCCTTATACGAGT
CTTTAACTGGCTTAGAAAATTTATTATTCTTTGGAAAAATGAAAGGTATT
CAAAAAACTGAATTAAAACAGCAGATAACTCATATTTCTAAAGTAGTAGA
TCTAGAAAACCAACTTGATAAATTTGTCTCAGGTTACTCAGGAGGTATGA
AAAGACGGCTTTCTcTAGCCATCGCCCTACTTGGAAACCCCACAGTTTTA
ATCCTAGATGAACCTACCGTTGGAATTGATCCATCCTTGAGGAGAAAAAT
CTGGCAAGAGCTAATTAATATTAaGGATGAAGGACATTCTATCTTTATTA
CAACCCACGTTATGGATGAAGCAGAATTAACAAGTAAGGTTGCACTACTA
TTACGTGGAAACATTATTGCCTTTGATACTCCATTACATTTAAAAAAACA
ATTTAATGTGAGTACTATTGAGGAAGTTTTCTTAAAAGCTGAAGGAGAA

SEQ ID NO. 7606 

STRAIN M732 

AAAAAAGTCATCGATTTAAAAAAACTACAAAAAGCATACGCCTCA
GAAACTGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGAGAAATAAT
TGGATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTAAAACTA
TGCTTGGCATGGAAAAAGCAGATAAGGGAACAGCTCTTGTTCTTGATACT
CAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATGGCTCAATC
TGATGCCTTACACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCTTTG
GAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAACTCAT
ATTTCTAAAGTAGTAGATCTAGAAAACCAACTTGATAAATTTGTCTCAGG
TTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCTACTTG
GAAACCCCACAGTTTTAATCCTAGATGAACCTAGCGTTGGAATTGATCCA
TCCTTGAGGAGAAAAATCTGGCAAGAGCTAATTAATATTAAGGATGAAGG
ACGTTCTATCTTTATTACAACCCACGTTATGGATGAAGCAGAATTAACAA
GTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACTCCA
TTACATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAGTTTTCTT
AAAAGCTGAAGGAGAA

SEQ ID NO. 7607 

STRAIN COH1 

AAAAAAGTCATCGATTTAAAAAAACTACAAAAAGCATACGCCTCAGAA
ACTGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGAGAAATAATTGG
ATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTAAAACTATGC
TTGGCATGGAAAAAGCAGATAAGGGAACAGCTCTTGTTCTTGATACTCAA
ATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATGGCTCAATCTGA
TGCCTTACACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCTTTGGAA
AAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAACTCATATT
TCTAAAGTAGTAGATCTAGAAAACCAACTTGATAAATTTGTCTCAGGTTA
CTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCTACTTGGAA
ACCCCACAGTTTTAATCCTAGATGAACCTACCGTTGGAATTGATCCATCC
TTGAGGAGAAAAATCTGGCAAGAGCTAATTAATATTAAGGATGAAGGACG
TTCTATCTTTATTACAACCCACGTTATGGATGAAGCAGAATTAACAAGTA
AGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACTCCATTA
CATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAG
SEQ ID NO. 7608

STRAIN M781 

AAAAAAGTCATCGATTTAAAAAAACTACAAAAAGCATAC
GCCTCAGAAACTGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGAGA
AATAATTGGATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTA
AAACTATGCTTGGCATGGAAAAAGCAGATAAGGGAACAGCTCTTGTTCTT
GATACTCAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATGGC
TCAATCTGATGCCTTACACGAGTCTTTAACTGGCTTAGAAAATTTATTAT
TCTTTGGAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATA
ACTCATATTTCTAAAGTAGTAGATCTAGAAAACCAACTTGATAAATTTGT
CTCAGGTTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCC
TACTTGGAAACCCCACAGTTTTAATCCTAGATGAACCTAGCGTTGGAATT
GATCCATCCTTGAGGAGAAAAATCTGGCAAGAGCTAATTAATATTAAGGA
TGAAGGACGTTCTATCTTTATTAGAACCCACGTTATGGATGAAGCAGAAT
TAACAAGTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGAT
ACTCCATTACATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAGT
TTTCTTAAAAGCTGAAGGAGAA

SEQ ID NO. 7609 

STRAIN CJB110 

AAAAAAGTCATCGATTTAAAAAAACTACAAAAAGCATATG
CCTCAGAAACTGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGCGAA
ATAATTGGATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTAA
AACTATGCTTGGCATGGAAAAAGCAGATAAGGGAACAGCTCTTGTTCTTG
ATACTCAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATGGCT
CAATCTGATGCCTTATACGAATCTTTAACTGCCTTAGAAAATTTATTATT
CTTTGGAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAA
CTCATATTTCTAAAGTAGTAGATCTAGAAAACCAACTTGATAAATTTGTC
TCAGGTTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCT
ACTTGGAAACCCCACAGTTTTAATCCTAGATGAACCTACCGTTGGAATTG
ATCCATCCTTGAGGAGAAAAATCTGGCAAGAGCTAATTAATATTAAGGAT
GAAGGACGTTCTATCTTTATTACAACCCACGTTATGGATGAAGCAGAATT
AACAAGTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATA
CTCCATTACATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAGTT
TTCTTAAAAGCTGAAGGAGAA

SEQ ID NO. 7610 

STRAIN 1169NT 

AAAAAAGTCATCGATTTAAAAAAACTACAAAAAGCATAC
GCCTCAGAAACTGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGCGA
AATAATTGGATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTA
AAACTATGCTTGGCATGGAAAAAGCAGATAAGGGAACAGCTCTTGTTCTT
GATACTCAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATGGC
TCAATCTGATGCCTTATACGAATCTTTAACTGCCTTAGAAAATTTATTAT
TCTTTGGAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATA
ACTCATATTTCTAAAGTAGTAGATCTAGAAAACCAACTTGATAAATTTGT
CTCAGGTTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCC
TACTTGGAAACCCCACAGTTTTAATCCTAGATGAACCTACCGTTGGAATT
GATCCATCCTTGAGGAGAAAAATCTGGCAAGAGCTAATTAATATTAAGGA
TGAAGGACGTTCTATCTTTATTACAACCCACGTTATGGATGAAGCAGAAT
TAACAAGTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGAT
ACTCCATTACATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAGT
TTTCTTAAAAGCTGAAGGAGAA

SEQ ID NO. 7611 

STRAIN JM9130013 

AAAAAAGTCATCGATTTAAAAAAACTACAAAAAGCATATGCC
TCAGAAACCGTTTTAAATAATATTAATTTGGAGGTGTTTAAAGGCGAAAT
AATTGGATTAATAGGACCCTCTGGAGCAGGGAAATCTACCTTGATTAAAA
CTATGCTTGGCATGGAAAAAGCAGATAAGGGAACAGCTCTTGTTCTTGAT
ACTCAAATGCCAGATCGTAATATTTTAAATCAAATTGGCTATATGGCTCA
ATCTGATGCCTTATACGAGTCTTTAACTGGCTTAGAAAATTTATTATTCT
TTGGAAAAATGAAAGGTATTCAAAAAACTGAATTAAAACAGCAGATAACT
CATATTTCTAAAGTAGTAGATCTAGAAAACCAACTTGATAAATTTGTCTC
AGGTTACTCAGGAGGTATGAAAAGACGGCTTTCTCTAGCCATCGCCCTAC
TTGGAAACCCCACAGTTTTAATCCTAGATGAACCTACCGTTGGAATTGAT
CCATCCTTGAGGAGAAAAATCTGGCAAGAGCTAATTAATATTAAGGATGA
AGGACGTTCTATCTTTATTACAACCCACGTTATGGATGAAGCAGAATTAA
CAAGTAAGGTTGCACTACTATTACGTGGAAACATTATTGCCTTTGATACT
CCATTACATTTAAAAAAACAATTTAATGTGAGTACTATTGAGGAAGTTTT
CTTAAAAGCTGAAGGAGAA

SEQ ID NO. 7612 
STRAIN 2603 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGHSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7613 

STRAIN 090 frame: 3 

LKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTALVLDT 
QMPDRNILNQIGYMAQSDALYESLTALENLLFFGKMKGIQKTELKQQITHISKVVDLENQ
LDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKDEGRSI 
FITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV 

SEQ ID NO. 7614 

STRAIN A909 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA
LVLDTQMPDHNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7615 

STRAIN H36B frame: 1

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA
LVLDTQMPDRNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7616 

STRAIN 18RS21 frame: 1 

DLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTALVLD 
TQMPDRNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVVDLEN
QLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKDEGHS
IFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7617 

STRAIN M732 frame: 1

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA
LVLDTQMPDRNILNQIGYMAQSDALHESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7618 

STRAIN COH1 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALHESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7619 

STRAIN M781 frame: 1

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA
LVLDTQMPDRNILNQIGYMAQSDALHESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV
SEQ ID NO. 7620

STRAIN CJB110 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALYESLTALENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD 
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV 

SEQ ID NO. 7621 

STRAIN 1169NT frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA
LVLDTQMPDRNILNQIGYMAQSDALYESLTALENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7622 

STRAIN JM9130013 frame: 1 

KKVIDLKKLQKAYASETVLNNINLEVFKGEIIGLIGPSGAGKSTLIKTMLGMEKADKGTA 
LVLDTQMPDRNILNQIGYMAQSDALYESLTGLENLLFFGKMKGIQKTELKQQITHISKVV
DLENQLDKFVSGYSGGMKRRLSLAIALLGNPTVLILDEPTVGIDPSLRRKIWQELINIKD
EGRSIFITTHVMDEAELTSKVALLLRGNIIAFDTPLHLKKQFNV

SEQ ID NO. 7701 
STRAIN 2603 

TTGCCTATGTTGTCTGTTGGTTTAGTTTTAGAGGGTGGCGGAATGAGAGGTCTTTATACT 
GCTGGAGTTTTAGATGCTTTTCTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTC 
TCTGCTGGTGCATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGA 
TACAATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATGGTTTCGAACA
GGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCTATGAAATTGGATGTATTT
GACGATGAAGCATTTAAAAAATCAAGTATTGATTTTTACGTAGTTGCTACAGAGATGACA 
TCTGGTAAACCTGAATATTTTAAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGT
GCTAGTTCAGCATTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTA
GATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGATTTGACAAG 
TTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAAGCCTTCAAGTGGACGATTG 
TATAAAACTCTGTATAGGAAATATCCTAATTTTGTAAAGACAGCCTCGAATCGGTACCAA
CAGTATAATAATAGTCTTGAAAAGGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCA 
ATTAGACCGAGTAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGAT 
AGTATTTATCAGCTTGGTATGAAAGATGCTAAAAGTGTGATGCCTGAGCTGAATAGTTAT
CTAATGAAA 

SEQ ID NO. 7702 

STRAIN 090 

CCTATGTTGTCTGTTGGTTTAGTTTTAG
AGGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTT
CTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCTGGTGC
ATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGAT
ACAATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATGG
TTTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCC
TATGAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATTG
ATTTTTACGTAGTTGCTACAGAGATGACATCTGGTAAACCTGAATATTTT
AAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGC
ATTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTAG
ATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGA
TTTGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAA
GCCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATT
TTGTAAAGACAGCCTCGAATCGGTACCAACAGTATAATAATAGTCTTGAA
AAGGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCGAG
TAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATA
GTATTTATCAGCTTGGTATGAAAGATGCTAAAAGTGTGATGCCTGAGCTG
AATAGTTATCTAATGAAA

SEQ ID NO. 7703 

STRAIN A909 

CCTATGTTGTCTGTTGGTTTAGTTTTAGAG
GGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTTCT
AGATGCAGGAATAAAAGTAGATGGTATCATATCTGTCTCTGCTGGTGCAT
TGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGATAC
AATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATGGCT
TCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCTA
TGAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATTGAT
TTTTACGCAGTTGCTACAGAGATGACATCTGGTAAACCTGAGTATTTTAA
AATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGCAT
TACCAGTAGTCTCAAAGATGGTTGTTTGGCAGGGGAAAAAGTACTTAGAT
GGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGATT
TGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAAGC
CTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATTTT
GTAAAGACAGCCTCGAACCGGTACCAACAGTATAATAATAGCCTTGAAAA
GGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCAAGTA
AGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATAGT
ATTTATCAGCTTGGTATGAAAGATGCTAAAAGTGGGATGCCTGAGCTGAA
TAGTTATCTAATGAAA

SEQ ID NO. 7704 

STRAIN H36B 

CCTATGTTGTCTGTTGGTTTAGTTTTAG
AGGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTT
CTAGATGCAGGAATAAAAGTAGATGGTATCATATCTGTCTCTGCTGGTGC
ATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGAT
ACAATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATGG
CTTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCC
TATGAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATTG
ATTTTTACGCAGTTGCTACAGAGATGACATCTGGTAAACCTGAGTATTTT
AAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGC
ATTACCAGTAGTCTCAAAGATGGTTGTTTGGCAGGGGAAAAAGTACTTAG
ATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGA
TTTGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAA
GCCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATT
TTGTAAAGACAGCCTCGAACCGGTACCAACAGTATAATAATAGCCTTGAA
AAGGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCAAG
TAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATA
GTATTTATCAGCTTGGTATGAAAGATGCTAAAAGTGGGATGCCTGAGCTG
AATAGTTATCTAATGAAA

SEQ ID NO. 7705 

STRAIN 18RS21 

CCTATGTTGTCTGTTGGTTTAGTTTTAGAGG
GTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTTCTA
GATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCTGGTGCATT
GTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGATACA
ATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATGGTTT
CGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCTAT
GAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATTGATT
TTTACGTAGTTGCTACAGAGATGACATCTGGTAAACCTGAATATTTTAAA
ATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGCATT
ACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTAGATG
GTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGATTT
GACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAAGCC
TTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATTTTG
TAAAGACAGCCTCGAATCGGTACCAACAGTATAATAATAGTCTTGAAAAG
GTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCGAGTAA
GAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATAGTA
TTTATCAGCTTGGTATGAAAGATGCTAAAAGTGTGATGCCTGAGCTGAAT
AGTTATCTAATGAAA

SEQ ID NO. 7706 

STRAIN M732 

CCTATGTTGTCTGTTGGTTTAGTTTTAGA
GGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTTC
TAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCGGGTGCA
TTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGATA
CAATAAAAAGTATTTATCCCACCCTGAATATATGAGTCTAAGATCATGGC
TTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCT
ATGAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATTGA
TTTTTACGTAGTTGCTACAGAGATGACATCTGGTAAACCTGAATATTTTA
AAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGCA
TTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTAGA
TGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGAT
TTGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAAG
CCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATTT
TGTAAAGACAGCCTCGAATCGGTACCAACAGTATAATAATAGTCTTGAAA
AGGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCGAGT
AAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATAG
TATTTATCAGCTTGGTATGAAATATGCTAAAAGTGTGATGCCTGAGCTGA
ATAGTTATCTAATGAAA

SEQ ID NO. 7707 

STRAIN COH1 

CCTATGTTGTCTGTTGGTTTAGTTTTA
GAGGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTT
TCTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCGGGTG
CATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGA
TACAATAAAAAGTATTTATCCCACCCTGAATATATGAGTCTAAGATCATG
GCTTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTC
CTATGAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATT
GATTTTTACGTAGTTGCTACAGAGATGACATCTGGTAAACCTGAATATTT
TAAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAG
CATTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTA
GATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGG 
ATTTGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAA 
AGCCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAAT 
TTTGTAAAGACAGCCTCGAATCGGTACCAACAGTATAATAATAGTCTTGA 
AAAGGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCGA 
GTAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGAT 
AGTATTTATCAGCTTGGTATGAAATATGCTAAAAGTGTGATGCCTGAGCT 
GAATAGTTATCTAATGAAA

SEQ ID NO. 7708 

STRAIN M781 

CCTATGTTGTCTGTTGGTTTAGTTTTAG
AGGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTT
CTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCGGGTGC
ATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGAT
ACAATAAAAAGTATTTATCCCACCCTGAATATATGAGTCTAAGATCATGG
CTTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCC
TATGAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATTG
ATTTTTACGTAGTTGCTACAGAGATGACATCTGGTAAACCTGAATATTTT
AAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGC
ATTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTAG
ATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGA
TTTGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAA
GCCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATT
TTGTAAAGACAGCCTCGAATCGGTACCAACAGTATAATAATAGTCTTGAA
AAGGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCGAG
TAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATA
GTATTTATCAGCTTGGTATGAAATATGCTAAAAGTGTGATGCCTGAGCTG
AATAGTTATCTAATGAAA

SEQ ID NO. 7709 

STRAIN CJB110 

CCTATGTTGTCTGTTGGTTTAGTTTTA 

GAGGGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTT 
TCTAGATGCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCTGGTG 
CATTGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGA 
TACAATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATG
GTTTCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTC 
CTATGAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATT 
GATTTTTACGTAGTTGCTACAGAGATGACATCTGGTAAACCTGAATATTT
TAAAATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAG 
CATTACCAGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTA 
GATGGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGG 
ATTTGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAA
AGCCTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAAT
TTTGTAAAGACAGCCTCGAATCGGTACCAACAGTATAATAATAGTCTTGA 
AAAGGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCGA
GTAAGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGAT 
AGTATTTATCAGCTTGGTATGAAAGATGCTAAAAGTGTGATGCCTGAGCT
GAATAGTTATCTAATGAAA

SEQ ID NO. 7710 

STRAIN 1169NT 

CCTATGTTGTCTGTTGGTTTAGTTTTAGAGGGTG
GCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTTCTAGAT
GCAGGAATAAAAATAGATGGTATCGTATCTGTCTCTGCGGGTGCATTGTT
TGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGATACAATA
AAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGATCATGGCTTCGA
ACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCTATGAA
ATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATTGATTTTT
ACGCAGTTGCTACAGAGATGACATCTGGTAAACCTGAATATTTTAAAATT
GATAGTGTCTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGCATTACC
AGTAGTCTCAAAGATGGTTGATTGGCAGGGGAAAAAGTACTTAGATGGTG
GTTTATCTGATAGTATCCCCGTTGATTTTGCCCGTGGTTTAGGATTTGAC
AAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAAGCCTTC
AAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATTTTGTAA
AGACAGCCTCGAATCGGTACCAACAGTATAATAATAGCCTTGAAAAGGTC
ATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGGCCGAGTAAAAG
CTTGGTTATTGTCCGCTTAGAGAAGAATCCGGATAAACTTGATAGTATTT
ATCAGCTTGGTATGAAAGATGCTAAAAGTGTGATGCCTGAGCTGAATAGT
TATCTAATGAAA

SEQ ID NO. 7711 

STRAIN JM9130013 

CCTATGTTGTCTGTTGGTTTAGTTTTAGAG
GGTGGCGGAATGAGAGGTCTTTATACTGCTGGAGTTTTAGATGCTTTTCT
AGATGCAGGAATAAAAGTAGATGGTATCATATCTGTCTCTGCTGGTGCAT
TGTTTGGTGTTAATTTTGTATCTAGACAACGAGAGAGGGCTTTGCGATAC
AATAAAAAGTATTTATCCCACCCTAAATATATGAGTCTAAGGTCATGGCT
TCGAACAGGGAATTTTGTTAATAAAGATTTCACCTATTATGAAGTTCCTA
TGAAATTGGATGTATTTGACGATGAAGCATTTAAAAAATCAAGTATTGAT
TTTTACGCAGTTGCTACAGAGATGACATCTGGTAAACCTGAGTATTTTAA
AATTGATAGTGTTTTTGAACAAATGGAAATTTTACGTGCTAGTTCAGCAT
TACCAGTAGTCTCAAAGATGGTTGTTTGGCAGGGGAAAAAGTACTTAGAT
GGTGGTTTATCTGATAGTATTCCCGTTGATTTTGCCCGTGGTTTAGGATT
TGACAAGTTGATTGTTGTGATGACTAGGCCGCTCAATTATCAGAAAAAGC
CTTCAAGTGGACGATTGTATAAAACTCTGTATAGGAAATATCCTAATTTT
GTAAAGACAGCCTCGAACCGGTACCAACAGTATAATAATAGCCTTGAAAA
GGTCATGAGCCTTGAAAAAACAGGCGATCTATTTGCAATTAGACCAAGTA
AGAGCTTGGTTATTGGCCGCTTAGAGAAGAATCCGGATAAACTTGATAGT
ATTTATCAGCTTGGTATGAAAGATGCTAAAAGTGGGATGCCTGAGCTGAA
TAGTTATCTAATGAAA

SEQ ID NO. 7712 
STRAIN 2603 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWFRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL 
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK 
SEQ ID NO. 7713

STRAIN 090 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWFRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK 

SEQ ID NO. 7714 

STRAIN A909 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKVDGIISVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYAVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVVWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSGMPELNSYLMK 

SEQ ID NO. 7715 

STRAIN H36B frame: 1

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKVDGIISVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYAVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVVWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSGMPELNSYLMK 

SEQ ID NO. 7716 

STRAIN 18RS21 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWFRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK 

SEQ ID NO. 7717 

STRAIN M732 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPEYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKYAKSVMPELNSYLMK 

SEQ ID NO. 7718 

STRAIN COH1 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPEYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKYAKSVMPELNSYLMK 

SEQ ID NO. 7719 

STRAIN M781 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY
NKKYLSHPEYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IVVMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI
RPSKSLVIGRLEKNPDKLDSIYQLGMKYAKSVMPELNSYLMK 

SEQ ID NO. 7720 

STRAIN CJB110 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWFRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYVVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL 
IWMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI 
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK 



SEQ ID NO. 7721 

STRAIN JM9130013 frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKVDGIISVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYAVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPWSKMWWQGKKYLDGGLSDSIPVDFARGLGFDKL 
IWMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI 
RPSKSLVIGRLEKNPDKLDSIYQLGMKDAKSGMPELNSYLMK 

SEQ ID NO. 7722 

STRAIN 1169NT frame: 1 

PMLSVGLVLEGGGMRGLYTAGVLDAFLDAGIKIDGIVSVSAGALFGVNFVSRQRERALRY 
NKKYLSHPKYMSLRSWLRTGNFVNKDFTYYEVPMKLDVFDDEAFKKSSIDFYAVATEMTS 
GKPEYFKIDSVFEQMEILRASSALPVVSKMVDWQGKKYLDGGLSDSIPVDFARGLGFDKL
IWMTRPLNYQKKPSSGRLYKTLYRKYPNFVKTASNRYQQYNNSLEKVMSLEKTGDLFAI 
RPSKSLVIVRLEKNPDKLDSIYQLGMKDAKSVMPELNSYLMK 

SEQ ID NO. 7801 
STRAIN 2603 

ATGAAAGTTTTAGTAGTTGATGATGAACCAGTTGCACGTAACGAATTAATTTAGCTTCTT
AATAAGTATGATTCTAACCTCGTTATAGCAGAGGCGCATGATATGGCTACTGCATTAGCT
ATTTTACTTAGAGAAACTTTTGATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCT
GGGTTGCAATTAGCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGATATTTGCG
ACTGCTTATGATCAATATGCTATTCAGGCTTTTGAGCATGATGCGCGTGATTATTTGTTA
AAACCCTATGATTTTGATAGGCTAAAGCAAGCTATGGATAGAGTAAAAGGAGCGCTAAGT
ACATCTACAATTATAGAGAGCGTAACTTCCGGTCCTCTCTTCAAGCAACAGTATCCATTG
ACAGTAGAAGATCGAATCTATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATG
CAAGGAAAACTGATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTCTCTACAA
CAATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTACATCGCTCTTACATTGTG
AACATTAATGCTATTAAAACGATTGAACCTTGGTTTAACCAAACACTTCAGTTACACCTT
TGTAATAAAATAACAGTTCCTGTTAGCAGAGCAAATGTAAAACCCCTAAAACAAATGTTA
GGCATATCTACC 

SEQ ID NO. 7802 

STRAIN 090 

AAAGTTTTAGTAGTTGATGATGAACCAGTTGCACGTAA
CGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGCAG
AGGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTTTT
GATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGGTTGCAATT
AGCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGATATTTGCGA
CTGCTTATGATCAATATGCTATTCAGGCTTTTGAGCATGATGCGCGTGAT
TATTTGTTAAAACCCTATGATTTTGATAGGCTAAAGCAAGCTATGGATAG
AGTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAACTTCCG
GTCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGATCGAATCTAT
CTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAACT
GATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTCTCTACAAC
AATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTACATCGCTCT
TACATTGTGAACATTAATGCTATTAAAACGATTGAACCTTGGTTTAACCA
AACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGTTAGCAGAG
CAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC

SEQ ID NO. 7803 

STRAIN A909 

AAAGTTTTAGTAGTTGATGATGAACCAGTTGCACGTAAC
GAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGCAGA
GGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTTTTG
ATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGGTTGCAATTA
GCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGATATTCGCGAC
TGCTTATGATCAATATGCTATTCAAGCTTTTGAGCATGATGCGCGTGATT
ATTTGTTAAAACCCTATGAGTTTGATAGGCTAAAGCAAGCTATGGATAGA
GTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAACTTCCGG
CCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGATCGAATCTATC
TGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAACTG
ATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTCTCTACAACA
ATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTGCACCGCTCTT
ACATTGTGAATATTAATGCTATTAAAACGATTGAACCTTGGTTTAACCAA
ACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGTTAGCAGAGC
AAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC

SEQ ID NO. 7804 

STRAIN H36B 

AAAGTTTTAGTAGTTGATGATGAACCAGTTGCACGT
AACGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGC
AGAGGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTT
TTGATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGGTTGCAA
TTAGCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGATATTCGC
GACTGCTTATGATCAATATGCTATTCAAGCTTTTGAGCATGATGCGCGTG
ATTATTTGTTAAAACCCTATGAGTTTGATAGGCTAAAGCAAGCTATGGAT
AGAGTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAACTTC
CGGCCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGATCGAATCT
ATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAA
CTGATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTCTCTACA
ACAATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTGCACCGCT
CTTACATTGTGAATATTAATGCTATTAAAACGATTGAACCTTGGTTTAAC
CAAACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGTTAGCAG
AGCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7805 

STRAIN 18RS21 

AAAGTTTTAGTAGTTGATGATGAACCAGTTGCACGTAAC
GAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGCAGA
GGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTTTTG
ATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGGTTGCAATTA
GCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGATATTTGCGAC
TGCTTATGATCAATATGCTATTCAGGCTTTTGAGCATGATGCGCGTGATT
ATTTGTTAAAACCCTATGATTTTGATAGGCTAAAGCAAGCTATGGATAGA
GTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAACTTCCGG
TCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGATCGAATCTATC
TGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAACTG
ATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTCTCTACAACA
ATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTACATCGCTCTT
ACATTGTGAACATTAATGCTATTAAAACGATTGAACCTTGGTTTAACCAA
ACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGTTAGCAGAGC
AAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC

SEQ ID NO. 7806 

STRAIN M732 

AAAGTTTTAGTAGTTGATGATGAACCAGTT
GCACGTAACGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGT
TATAGCAGAGGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAG
AAACTTTTGATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGG
TTGCAATTAGCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGAT
ATTCGCGACTGCTTATGATCAATATGCTATTCAGGCTTTTGAGCAGGATG
CGCGTGATTATTTGTTAAAACCCTATGAGTTTGATAGGTTAAAGCAAGCT
ATGGATAGAGTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGT
AGCTTCCGGTCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGATC
GAATCTATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAA
GGAAAACTGATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTC
TCTACAACAATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTAC
ATCGCTCTTACATTGTGAATATTAATGCTATTAAAACGATTGAACCTTGG 
TTTAACCAAACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGT 
TAGCAGAGCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7807 

STRAIN COH1 

AAAGTTTTAGTAGTTGATGATGAACCAGTTGCACGTA
ACGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGCA
GAGGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTTT
TGATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGGTTGCAAT
TAGCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGATATTCGCG
ACTGCTTATGATCAATATGCTATTCAGGCTTTTGAGCAGGATGCGCGTGA
TTATTTGTTAAAACCCTATGAGTTTGATAGGTTAAAGCAAGCTATGGATA
GAGTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAGCTTCC
GGTCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGATCGAATCTA
TCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAAC
TGATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTCTCTACAA
CAATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTACATCGCTC
TTACATTGTGAATATTAATGCTATTAAAACGATTGAACCTTGGTTTAACC
AAACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGTTAGCAGA 
GCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC 

SEQ ID NO. 7808 

STRAIN M781 

AAAGTTTTAGTAGTTGATGATGAACCAGTTGCACGTAAC
GAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCGTTATAGCAGA
GGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGAGAAACTTTTG
ATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGGGTTGCAATTA
GCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGATATTCGCGAC
TGCTTATGATCAATATGCTATTCAGGCTTTTGAGCAGGATGCGCGTGATT
ATTTGTTAAAACCCTATGAGTTTGATAGGTTAAAGCAAGCTATGGATAGA
GTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCGTAGCTTCCGG
TCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGATCGAATCTATC
TGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCAAGGAAAACTG
ATTATACAAACACCTGATAAAAATTATGAAATTGATGGCTCTCTACAACA
ATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTACATCGCTCTT
ACATTGTGAATATTAATGCTATTAAAACGATTGAACCTTGGTTTAACCAA
ACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTGTTAGCAGAGC
AAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC

SEQ ID NO. 7809 

STRAIN CJB110 

CTTAATAAGTATGATTCTAACCTCGTTATAGCAGAGGCGCATGATATGGC 
TACTGCATTAGCTATTTTACTTAGAGAAACTTTTGATGTAGCACTGTTAG
ATATCCATCTCAGAGATGATTCTGGGTTGCAATTAGCAGAGTATATCAAT
AAAATGCCCAAACCACCATTATTGATATTCGCGACTGCTTATGATCAATA
TGCTATTCAAGCTTTTGAGCATGATGCGCGTGATTATTTGTTAAAACCCT
ATGAGTTTGATAGGCTAAAGCAAGnTATGGATAGAGTAAAAGGAGCGCTA
AGTACATCTACAATTATAGAGAGCGTAACTTCCGGCCCTCTCTTCAAGCA
ACAGTATCCATTGACAGTAGAAGATnGAATCTATCTGGTGTCGGCGGATG
ATATCCTTTTGATTGAAGCTATGCAAGGAAAACTGATTATACAAACACCT
GATAAAAATTATGAAATTGATGGCTCTCTACAACAATGGCAAGATAAACT
ACCATCATCTCAATTTGTACGGGTGCACCGCTCTTACATTGTGAATATTA
ATGCTATTAAAACGATTGAACCTTGGTTTAACCAAACACTTCAGTTACAC
CTTTGTAATAAAATAACAGTTCCTGTTAGCAGAGCAAATGTAAAACCCCT
AAAACAAATGTTAGG 

SEQ ID NO. 7810 

STRAIN 1169NT 

AAAGTTTTAGTAGTTGATGATGAACCAG
TTGCACGTAACGAATTAATTTATCTTCTTAATAAGTATGATTCTAACCTC
GTTATAGCAGAGGCGCATGATATAGCTACTGCATTAGCTATTTTACTTAG
AGAAACTTTTGATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTG
GGTTGCAATTAGCAGAGTATATCAATAAAATGCCCAAACCACCATTATTG
ATATTCGCGACTGCTTATGATCAATATGCTATTCAGGCTTTTGAGCATGA
TGCGCGTGATTATTTGTTAAAACCCTATGAGTTTGATAGGCTAAAGCAAG
CTATGGATAGAGTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGC
GTAACTTCCGGCCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGA
TCGAATCTATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGC
AAGGAAAACTGATTATACAAACACCTGATAAAAATTATGAAATTGATGGC
TCTCTACAACAATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGT
GCACCGCTCTTACATTGTGAATATTAATGCTATTAAAACGATTGAACCTT
GGTTTAACCAAACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCT
GTTAGCAGAGCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTAC
C

SEQ ID NO. 7811 

STRAIN JM9130013 

AAAGTTTTAGTAGTTGATGATGAACCAGT
TGCACGTAACGAATTAATTTACCTTCTTAATAAGTATGATTCTAACCTCG
TTATAGCAGAGGCGCATGATATGGCTACTGCATTAGCTATTTTACTTAGA
GAAACTTTTGATGTAGCACTGTTAGATATCCATCTCAGAGATGATTCTGG
GTTGCAATTAGCAGAGTATATCAATAAAATGCCCAAACCACCATTATTGA
TATTCGCGACTGCTTATGATCAATATGCTATTCAAGCTTTTGAGCATGAT
GCGCGTGATTATTTGTTAAAACCCTATGAGTTTGATAGGCTAAAGCAAGC
TATGGATAGAGTAAAAGGAGCGCTAAGTACATCTACAATTATAGAGAGCG
TAACTTCCGGCCCTCTCTTCAAGCAACAGTATCCATTGACAGTAGAAGAT
CGAATCTATCTGGTGTCGGCGGATGATATCCTTTTGATTGAAGCTATGCA
AGGAAAACTGATTATACAAACACCTGATAAAAATTATGAAATTGATGGCT
CTCTACAACAATGGCAAGATAAACTACCATCATCTCAATTTGTACGGGTG
CACCGCTCTTACATTGTGAATATTAATGCTATTAAAACGATTGAACCTTG
GTTTAACCAAACACTTCAGTTACACCTTTGTAATAAAATAACAGTTCCTG
TTAGCAGAGCAAATGTAAAACCCCTAAAACAAATGTTAGGCATATCTACC

SEQ ID NO. 7812 
STRAIN 2603 frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYDFDRLKQAMDRVKGALST
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7813 

STRAIN 090 frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYDFDRLKQAMDRVKGALST
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7814 

STRAIN A909 frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYEFDRLKQAMDRVKGALST
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7815 

STRAIN H36B frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYEFDRLKQAMDRVKGALST
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7816 

STRAIN 18RS21 frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG 
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYDFDRLKQAMDRVKGALST 
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ 
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7817 

STRAIN M732 frame: 1

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEQDARDYLLKPYEFDRLKQAMDRVKGALST
STIIESVASGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7818 

STRAIN COH1 frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEQDARDYLLKPYEFDRLKQAMDRVKGALST
STIIESVASGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7819 

STRAIN M781 frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEQDARDYLLKPYEFDRLKQAMDRVKGALST
STIIESVASGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7820 

STRAIN CJB110 frame: 1 

LNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSGLQLAEYINKMPKPPLLIF 
ATAYDQYAIQAFEHDARDYLLKPYEFDRLKQXMDRVKGALSTSTIIESVTSGPLFKQQYP 
LTVEDXIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQWQDKLPSSQFVRVHRSYI 
VNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQML 

SEQ ID NO. 7821 

STRAIN 1169NT frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDIATALAILLRETFDVALLDIHLRDDSG 
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYEFDRLKQAMDRVKGALST 
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ 
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG 
IST

SEQ ID NO. 7822 

STRAIN JM9130013 frame: 1 

KVLVVDDEPVARNELIYLLNKYDSNLVIAEAHDMATALAILLRETFDVALLDIHLRDDSG
LQLAEYINKMPKPPLLIFATAYDQYAIQAFEHDARDYLLKPYEFDRLKQAMDRVKGALST
STIIESVTSGPLFKQQYPLTVEDRIYLVSADDILLIEAMQGKLIIQTPDKNYEIDGSLQQ
WQDKLPSSQFVRVHRSYIVNINAIKTIEPWFNQTLQLHLCNKITVPVSRANVKPLKQMLG
IST

SEQ ID NO. 7901
STRAIN 2603 

ATGGGAATTGAATTTAAAAATGTAAGTTATACCTATCAAGCCGGCACTCCTTTTGAAGGG 
CGTGCCCTTTTTGACGTCAATCTGAAAATTGAAGATGCTTCCTATACCGCGTTCATTGGG 
CACACAGGTTCTGGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTAGA
AAAGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACAAAGAAATC
AAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCAGAAAGTCAGCTTTTTGAA
GAGACAGTTTTAAAGGATGTTGCTTTTGGACCACAAAATTTTGGTATTTCTCAGATTGAA
GCTGAAAGGCTGGCTGAAGAAAAATTAAGGTTAGTTGGTATCAGTGAGGATTTATTCGAT
AAAAATCCATTTGAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTA
GCGATGGAACCCAAAGTAGTAGTACTGGATGAGCCAACAGCTGGACTTGATCCTAAGGGA
AGAAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAAGGAATGACTATCGTCTTA
GTGACTCACTTAATGGACGATGTAGCGGATTATGCTGACTATGTGTATGTTTTAGAAGCA
GGGAAAGTAACCTTATCAGGACAACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAA
AGTAAACAATTAGGAGTTCCCAAAATCACCAAGTTTGCTCAAAGACTATCTCATAAGGGA 
TTAAATTTACCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTAAGCATGGA 

SEQ ID NO. 7902 
STRAIN 090 

GGAATTGAATTTAAAAATGTAAGTTATACCTATCAAGCC
GGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATCTGAAAATTGA
AGATGCTTCCTATACCGCGTTCATTGGGCACACAGGTTCTGGAAAATCAA
CTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAAAGGTGAGGTA
ATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACAAAGAAATCAA
ATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCAGAAAGTCAGC
TTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACCACAAAATTTT
GGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAAAATTAAGGTT
AGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTTGAACTTTCTG
GAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGCGATGGAACCC
AAAGTACTAGTACTGGATGAGCCAACAGCTGGACTTGATCCTAAGGGAAG
AAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAAGGAATGACTA
TCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTATGCTGACTAT
GTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGACAACCAAAACA
GATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTAGGAGTTCCCA
AAATCACCAAGTTTGCTCAAAGACTATCTCATAAGGGATTAAATTTACCT
AGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTAAGCATGGA

SEQ ID NO. 7903 

STRAIN A909

GGAATTGAATTTAAAAATGTAAGTTATACCTATCAA
GCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATCTGAAAAT
TGAAGATGCTTCCTATACCGCGTTCATTGGGCACACAGGTTCTGGAAAAT
CAACTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAAAGGTGAG
GTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACAAAGAAAT
CAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCAGAAAGTC
AGCTTTTTGAAGAGACAGTTTTAAAAGATGTTGCTTTTGGACCACAAAAT
TTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAAAATTAAG
GTTAGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTTGAACTTT
CTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGCGATGGAA
CCCAAAGTACTAGTACTAGATGAGCCAACAGCTGGACTTGATCCTAAGGG
AAGAAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAAGGAATGA
CTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTATGCTGAC
TATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGACAACCAAA
GCAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTAGGAGTTC
CCAAAATCACCAAGTTTGCTCAAAGGCTATCTCATAAGGGATTAAATTTA
CCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTAAGCATGG
A 

SEQ ID NO. 7904 

STRAIN H36B

GGAATTGAATTTAAAAATGTAAGTTATAC
CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC
TGAAAATTGAAGATGCTTCCTATACCGCGTTCATTGGGCACACAGGTTCT
GGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAA
AGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACA
AAGAAATCAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCA
GAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAAGATGTTGCTTTTGGACC
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA
AATTAAGGTTAGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTT
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC
GATGGAACCCAAAGTACTAGTACTAGATGAGCCAACAGCTGGACTTGATC
CTAAGGGAAGAAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAA
GGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTA
TGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGAC
AACCAAAGCAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTA 
GGAGTTCCCAAAATCACCAAGTTTGCTCAAAGGCTATCTCATAAGGGATT 
AAATTTACCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTA 
AGCATGGA 

SEQ ID NO. 7905 

STRAIN 18RS21 

GGAATTGAATTTAAAAATGTAAGTTATAC
CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC
TGAAAATTGAAGATGCTTCCTATACCGCGTTCATTGGGCACACAGGTTCT
GGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAA
AGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACA
AAGAAATCAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCA
GAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACC
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA
AATTAAGGTTAGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTT
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC
GATGGAACCCAAAGTACTAGTACTGGATGAGCCAACAGCTGGACTTGATC
CTAAGGGAAGAAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAA
GGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTA
TGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGAC
AACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTA
GGAGTTCCCAAAATCACCAAGTTTGCTCAAAGACTATCTCATAAGGGATT
AAATTTACCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTA 
AGCATGGA 

SEQ ID NO. 7906 

STRAIN M732 

GGAATTGAATTTAAAAATGTAAGTTATAC
CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC
TGAAAATTGAAGATGTTTCCTATACCGCGTTCATTGGGCACACAGGTTCT
GGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTAGAAA
AGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACA
AAGAAATCAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCA
GAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACC
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA
AATTAAGGTTAGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTT
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC
GATGGAACCCAAAGTACTAGTACTGGATGAGCCAACAGCTGGACTTGATC
CTAAGGGAAGAAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAA
GGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTA
TGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGAC
AACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTA
GGAGTTCCCAAAATCACCAAGTTTGCTCAAAGACTATCTCATAAGGGATT
AAATTTACCTAGTTTACCAATTAGTATTAACGAATTTGTGGAGGCTATTA
AGCATGGA 

SEQ ID NO. 7907 

STRAIN COH1 

GGAATTGAATTTAAAAATGTAAGTTATACCTATCAAGCC
GGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATCTGAAAATTGA
AGATGTTTCCTATACCGCGTTCATTGGGCACACAGGTTCTGGAAAATCAA
CTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAAAGGTGAGGTA
ATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACAAAGAAATCAA
ATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCAGAAAGTCAGC
TTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACCACAAAATTTT
GGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAAAATTAAGGTT
AGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTTGAACTTTCTG
GAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGCGATGGAACCC
AAAGTACTAGTACTGGATGAGCCAACAGCTGGACTTGATCCTAAGGGAAG
AAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAAGGAATGACTA
TCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTATGCTGACTAT
GTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGACAACCAAAACA
GATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTAGGAGTTCCCA
AAATCACCAAGTTTGCTCAAAGACTATCTCATAAGGGATTAAATTTACCT
AGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTAAGCATGGA

SEQ ID NO. 7908 

STRAIN M781

GGAATTGAATTTAAAAATGTAAGTTATAC
CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC
TGAAAATTGAAGATGTTTCCTATACCGCGTTCATTGGGCACACAGGTTCT
GGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAA
AGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACA
AAGAAATCAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCA
GAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACC
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA
AATTAAGGTTAGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTT
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC
GATGGAACCCAAAGTACTAGTACTGGATGAGCCAACAGCTGGACTTGATC
CTAAGGGAAGAAAAGAATTAATGAGTCTTTTTAAAAATCTTCATAAAAAA
GGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTA
TGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGAC
AACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTA
GGAGTTCCCAAAATCACCAAGTTTGCTCAAAGACTATCTCATAAGGGATT
AAATTTACCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTA
AGCATGGA 

SEQ ID NO. 7909 

STRAIN CJB110 

GGAATTGAATTTAAAAATGTAAGTTATAC
CTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTCAATC
TGAAAATTGAAGATGCTTCCTATACCGCGTTCATTGGGCACACAGGTTCT
GGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTACAAA
AGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAGAACA
AAGAAATCAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATTTCCA
GAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTGGACC
ACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAAGAAA
AATTAAGGTTAGTTGGTATCAGTGAGGATTTATTCGATAAAAATCCATTT
GAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTTTAGC
GATGGAACCCAAAGTACTAGTACTGGATGAGCCAACAGCTGGACTTGATC
CTAAGGGAAGAAAAGAATTAATGACTCTTTTTAAAAATCTTCATAAAAAA
GGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGGATTA
TGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCAGGAC
AACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACAATTA
GGAGTTCCCAAAATCACCAAGTTTGCTCAAAGACTATCTCATAAGGGATT
AAATTTACCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCTATTA
AGCATGGA 

SEQ ID NO. 7910 

STRAIN 1169NT 

GGAATTGAATTTAAAAATGTAA
GTTATACCTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGAC
GTCAATCTGAAAATTGAAGATGCTTCCTATACCGCGTTCATTGGGCACAC
AGGTTCTGGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTC
CTACAAAAGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGAC
AAGAACAAAGAAATCAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCA
ATTTCCAGAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTT
TTGGACCACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCT
GAAGAAAAATTAAGGTTAGTTGGTATCAGTGAGGATTTATTCGATAAAAA
TCCATTTGAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTA
TTTTAGCGATGGAACCCAAAGTACTAGTACTGGATGAGCCAACAGCTGGA
CTTGATCCTAAGGGAAGAAAAGAATTAATGACTCTTTTTAAAAATCTTCA
TAAAAAAGGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAG
CGGATTATGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTA
TCAGGACAACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAA
ACAATTAGGAGTTCCCAAAATCACCAAGTTTGCTCAAAGACTATCTCATA
AGGGATTAAATTTACCTAGTTTACCAATTACTATTAACGAATTTGTGGAG
GCTATTAAGCATGGA

SEQ ID NO. 7911 

STRAIN JM9130013 

GGAATTGAATTTAAAAATGTAAGTT
ATACCTATCAAGCCGGCACTCCTTTTGAAGGGCGTGCCCTTTTTGACGTT
AATCTGAAAATTGAAGATGCTTCCTATACCGCATTCATTGGGCACACAGG
TTCTGGAAAATCAACTATTATGCAACTTTTGAATGGTTTACATATTCCTA
CAAAAGGTGAGGTAATTGTCGATGATTTTTCTATTAAAGCAGGGGACAAG
AACAAAGAAATCAAATTTATAAGGCAAAAAGTTGGTTTAGTTTTTCAATT
TCCAGAAAGTCAGCTTTTTGAAGAGACAGTTTTAAAGGATGTTGCTTTTG
GACCACAAAATTTTGGTATTTCTCAGATTGAAGCTGAAAGGCTGGCTGAA
GAAAAATTAAGGTTAGTTGGTATTAGTGAGGATTTATTCGATAAAAATCC
ATTTGAACTTTCTGGAGGGCAGATGAGGCGGGTTGCTATAGCTGGTATTT
TAGCGATGGAACCCAAAGTACTAGTACTGGATGAGCCAACAGCTGGACTT
GATCCTAAGGGAAGAAAAGAATTAATGACTCTTTTTAAAAATCTTCATAA
AAAAGGAATGACTATCGTCTTAGTGACTCACTTAATGGACGATGTAGCGG
ATTATGCTGACTATGTGTATGTTTTAGAAGCAGGGAAAGTAACCTTATCA
GGACAACCAAAACAGATTTTTCAAGAAGTAGAACTTTTAGAAAGTAAACA
ATTAGGAGTTCCCAAAATCACCAAGTTTGCTCAAAGACTATCTCATAAGG
GATTAAATTTACCTAGTTTACCAATTACTATTAACGAATTTGTGGAGGCT
ATTAAGCATGGA 

SEQ ID NO. 7912 
STRAIN 2603 frame: 1

MGIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7913 

STRAIN 090 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG

SEQ ID NO. 7914 

STRAIN A909 frame: 1

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR 
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7915 

STRAIN H36B frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG

SEQ ID NO. 7916 

STRAIN 18RS21 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG

SEQ ID NO. 7917

STRAIN M732 frame: 1

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDVSYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR 
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7918 

STRAIN COH1 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDVSYTAFIGHTGSGKSTIMQLLNGLHIPTK
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7919 

STRAIN M781 frame: 1

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDVSYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR 
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7920 

STRAIN CJB110 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR 
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES 
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7921 

STRAIN 1169NT frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR 
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 7922 

STRAIN JM9130013 frame: 1 

GIEFKNVSYTYQAGTPFEGRALFDVNLKIEDASYTAFIGHTGSGKSTIMQLLNGLHIPTK 
GEVIVDDFSIKAGDKNKEIKFIRQKVGLVFQFPESQLFEETVLKDVAFGPQNFGISQIEA 
ERLAEEKLRLVGISEDLFDKNPFELSGGQMRRVAIAGILAMEPKVLVLDEPTAGLDPKGR
KELMTLFKNLHKKGMTIVLVTHLMDDVADYADYVYVLEAGKVTLSGQPKQIFQEVELLES
KQLGVPKITKFAQRLSHKGLNLPSLPITINEFVEAIKHG 

SEQ ID NO. 8001 
STRAIN 2603 

GTGAACCACTTACTTAACCTCAGTAAAGAAAATATAGCTAAAATAGATTTTGACTTTCTT
AATGAGGCACTTAATGCAAATATTCGTTTGAAAGAATTAGTAGATGAACTAAAAATTTCA
AAAGAACTGGACAGTAAAGGTTGGTCCAAAAAAGACTCTCGAACGATAAAAATCTTGTAC
GATGGCCTTATCAATAAACATATAGTTTCCCTAGATCGTGCAGATTATAACATTATCCAA
GTCATTCCATTTGCTAATGTACATGTACTACTGTTTTTAATACCAGAAAGGGAGAATTCT
AAAAATTATAGAATATACAACTACAGTGATTATGAAATGGAGTTAATCAATGAGGATAGG
CAACAATTTTCAAAATATGAAACAGTTGATTTAGACCAATTGATACTTGTTGATATTTTT
AATATTGATGACTACATTTCATCATATTTAACAATA

SEQ ID NO. 8002 
STRAIN H3 6B 

AAC C AC T TAG T T AAC C T C AG T AAAG AAAAT AT AG C T 

AAAATAGATTTTGACTTTCTTAATGAGGCACTTAATGCAAATATTCGTTT
GAAAGAATTAGTAGATGAACTAAAAATTTCAAAAGAACTGGACAGTAAAG
GTTGGTCCAAAAAAGACTCTCGAACGATAAAAATCTTGTACGATGGCCTT
ATCAATAAACATATAGTTTCCCTAGATCGTGCAGATTATAACATTATCCA
AGTCATTCCATTTGCTAATGTACATGTACTACTGTTTTTAATACCAGAAA 
GGGAGAATTCTAAAAATTATAgAATATACAACTACAGTGATTATGAAATG
GAGTTAATCAATGAGGATAGGCAACAATTTTCAAAATATGAAACAGTTGA
TTTAGACCAATTGATACTTGTTGATATTTTTAATATTGATGACTACATTT
CATCATATTTAACAATA 

SEQ ID NO. 8003 
STRAIN 18RS21 

AACCACTTACTTAACCTCAGTAAAGAAAATATAG
CTAAAATAGATTTTGACTTTCTTAATGAGGCACTTAATGCAAATATTCGT
TTGAAAGAATTAGTAGATGAACTAAAAATTTCAAAAGAACTGGACAGTAA
AGGTTGGTCCAAAAAAGACTCTCGAACGATAAAAATCTTGTACGATGGCC
TTATCAATAAACATATAGTTTCCCTAGATCGTGCAGATTATAACATTATC
CAAGTCATTCCATTTGCTAATGTACATGTACTACTGTTTTTAATACCAGA
AAGGGAGAATTCTAAAAATTATAGAATATACAACTACAGTGATTATGAAA
TGGAGTTAATCAATGAGGATAGGCAACAATTTTCAAAATATGAAACAGTT
GATTTAGACCAATTGATACTTGTTGATATTTTTAATATTGATGACTACAT 
TTCATCATATTTAACAATA

SEQ ID NO. 8004 
STRAIN 2603 frame: 1

VNHLLNLSKENIAKIDFDFLNEALNANIRLKELVDELKISKELDSKGWSKKDSRTIKILY 
DGLINKHIVSLDRADYNIIQVIPFANVHVLLFLIPERENSKNYRIYNYSDYEMELINEDR
QQFSKYETVDLDQLILVDIFNIDDYISSYLTI 

SEQ ID NO. 8005 

STRAIN H36B frame: 1 

NHLLNLSKENIAKIDFDFLNEALNANIRLKELVDELKISKELDSKGWSKKDSRTIKILYD 
GLINKHIVSLDRADYNIIQVIPFANVHVLLFLIPERENSKNYRIYNYSDYEMELINEDRQ 
QFSKYETVDLDQLILVDIFNIDDYISSYLTI 

SEQ ID NO. 8006 

STRAIN 18RS21 frame: 1 

NHLLNLSKENIAKIDFDFLNEALNANIRLKELVDELKISKELDSKGWSKKDSRTIKILYD 
GLINKHIVSLDRADYNIIQVIPFANVHVLLFLIPERENSKNYRIYNYSDYEMELINEDRQ 
QFSKYETVDLDQLILVDIFNIDDYISSYLTI 

SEQ ID NO. 8101 
STRAIN 090 

AGCAAGCCTAATGTTGTTCAGTTAAA 

TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG 
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT 
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTGCTAGCAAAACAACTAAAAAAT
CCAGATTACGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGAC
CGGCGAAATGATTTACCCATTACCAGACCTTTTACCAAAA

SEQ ID NO. 8102 

STRAIN A909 

AGCAAGCCTAATGTTGTTCAGTTAAATAATCAATA 

TATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGGAGTTACGCCGAAAAAATCG
TTTAATGGGTTGGGTTCTTATTTTTGTCATGCTtttATTTATTTTACCCACTTATAATTT 
AGTTAAGAGTTACAGAACTTTACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGA
CTATCAGACATTAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAAAA
TCCAGATTACGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGACCGGCGAAAT
GATTTACCCATTAGCAGACCT

SEQ ID NO. 8103 
STRAIN H36B

AGCAAGCCTAATGTTGTTCAGTTAAA 

TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG 
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT 
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAAAAT
CCAGATTACGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGAC
CGGCGAAATGATTTACCCATTAGCAGACCtTTTACCAAAA

SEQ ID NO. 8104 

STRAIN 18RS21 

AGCAAGCCTAATGTTGTTCAGTTAAATAATCAATATATTAACGATGAGAATCTAAAAAAA
CGTTACGAAGCTGAGGAGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTT
GTCATGCTTTTATTTATTTTAGCCACTTATAATTTAGTTAAGAGTTACAGAACTTTACAA
GAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACATTAACTAATAGAACT
GAGAACCAGAAGTTGCTAGCAAAACAACTAAAAAATCCAGATTACGTTCAAAAATATGCT
CGAGCTAAGTATTATTTCTCTAAGACCGGCGAAATGATTTACCCATTACCAGACCTTTTA 
CCAAAA 

SEQ ID NO. 8105 
STRAIN M732 

AGCAAGCCTAATGTTGTTCAGTTAAA

TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG 
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAAAAT
CCAGATTACGTTCAAAAATATGCTCGAGCGAAGTATTATTTCTCTAAGAC 
CGGCGAAATGATTTAGCCATTACCAGACCtTTTACCAAAA

SEQ ID NO. 8106 
STRAIN COH1 

AGCAAGCCTAATGTTGTTCAGTTAAATAATC

AATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGGAGTTA
CGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATGCTttt 
ATTTATTTTAGCCACTTATAATTTAGTTAAGAGTTACAGAACTTTACAAG
AACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACATTAACT
AATAGAACTGAGAACCAGAAGTTAGTAGCAAAACAACTAAAAAATCCAGA
TTACGTTCAAAAATATGCTCGAGCGAAGTATTATTTCTCTAAGACCGGCG 
AAATGATTTACCCATTACCAGACCTTTTACCAAAA

SEQ ID NO. 8107 
STRAIN M781

AGCaAGCCTAATGTTGTTCAGTT 

AAATAATCAATATaTTAACGATGAGAATCTAAAAAAACGTTACGAAGCTG 
AGGAGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTC 
ATGCTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAAC 
TTTACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGA
CATTAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAA
AATCCAGATTACGTTCAAAAATATGCTCGAGCGAAGTATTATTTCTCTAA 
GACCGGCGAAATGATTTACCCATTAGCAGACCtTTTACCAAAA

SEQ ID NO. 8108 
STRAIN CJB110 

AGCAAGCCTAATGTTGTTCAGTTAAATAATC

AATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGGAGTTA
CGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATGCTttt 
ATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTTACAAG
AACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACATTAACT
AATAGAACTGAGAACCAGAAGTTGCTAGCAAAACAACTAAAAAATCCAGA
TTACGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGACCGGCG 
AAATGATTTACCCATTACCAGACCtTTTACCAAAA

SEQ ID NO. 8109 
STRAIN 1169NT 

AGCAAGCCTAATGTTGTTCAGTTAAA

TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTAGGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG 
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTAGTAGCAAAACAACTAAAAAAT
CCAGATTACGTTCAAAAATATGCTCGAGCTAAGTATTATTTCTCTAAGAC
CGGCGAAATGATTTACCCATTACCAGACCtTTTAGCAAAA

SEQ ID NO. 8110 
STRAIN JM9130013 
AGCaAGCCTAATGTTGTTCAGTTAAA
TAATCAATATATTAACGATGAGAATCTAAAAAAACGTTACGAAGCTGAGG
AGTTACGCCGAAAAAATCGTTTAATGGGTTGGGTTCTTATTTTTGTCATG 
CTTTTATTTATTTTACCCACTTATAATTTAGTTAAGAGTTACAGAACTTT
ACAAGAACGTCGTCAAGAAGTTGTAAAATTAACGAAAGACTATCAGACAT
TAACTAATAGAACTGAGAACCAGAAGTTACTAGCAAAACAACTAAAAAAT 
CCAGATTACGTTCAAAAATATGCTCGAGCGAAGTATTATTTCTCTAAGAC
TGGCGAAATGATTTACCCATTACCAGACCtTTTACCAAAA

SEQ ID NO. 8111 
STRAIN 2603

agcaagcctaatgttgttcagttaaataatcaatatattaacgatgagaa 
tctaaaaaaacgttacgaagctgaggagttacgccgaaaaaatcgtttaa 
tgggttgggttcttatttttgtcatgcttttatttattttacccacttat 
aatttagttaagagttacagaactttacaagaacgtcgtcaagaagttgt 
aaaattaacgaaagactatcagacattaactaatagaactgagaaccaga 
agttgctagcaaaacaactaaaaaatccagattacgttcaaaaatatgct 
cgagctaagtattatttctctaagaccggcgaaatgatttacccattacc 
agaccttttaccaaaa 

SEQ ID NO. 8112 
STRAIN 090

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM

IYPLPDLLPK 

SEQ ID NO. 8113 

STRAIN A909 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM 

IYPLPD 

SEQ ID NO. 8114 

STRAIN H3 6B 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM 

IYPLPDLLPK 

SEQ ID NO. 8115 

STRAIN 18RS21 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNLVKSYRTLQ
ERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEMIYPLPDLL
PK 

SEQ ID NO. 8116 

STRAIN M732 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM

IYPLPDLLPK 

SEQ ID NO. 8117 

STRAIN COH1 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNLVK 

SYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEMIY 

PLPDLLPK 

SEQ ID NO. 8118 

STRAIN M781 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYN 

LVKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGE 

MIYPLPDLLPK 

SEQ ID NO. 8119 

STRAIN CJB110 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNLVK
SYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEMIY
PLPDLLPK 

SEQ ID NO. 8120 

STRAIN 1169NT 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM

IYPLPDLLPK 

SEQ ID NO. 8121 

STRAIN JM9130013 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNL

VKSYRTLQERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEM

IYPLPDLLPK 

SEQ ID NO. 8122 

STRAIN 2603 

SKPNVVQLNNQYINDENLKKRYEAEELRRKNRLMGWVLIFVMLLFILPTYNLVKSYRTLQ
ERRQEVVKLTKDYQTLTNRTENQKLLAKQLKNPDYVQKYARAKYYFSKTGEMIYPLPDLL
PK 

SEQ ID NO. 8201 
STRAIN 2603 

ATGAAAAATTTATTGTTAAAATGTAAGGATAAGAAGGTTAAAGCATTTACACTTTTAGAA 
TGTTTGGTAGCATTGGTTACAATCACAGGAGCTTTACTAGTTTATCAAGGACTGACAAAA 
TTGTTGGCTCAACAGATAGTAGTGATGTCTTCTTCCAGTCAGTCTGAATGGGTGTTATTA 
AcTCAGCAACTAAATGCAGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAA
CTTTATTTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATTTC 
CGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGGGTTAGACAATTGT 
CAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTTTTTTATTTTAAGGACGGGTTAAAA 
AGGACATTTTACTATGATTTTAAAGAAGAAACTTAA

SEQ ID NO. 8202 

STRAIN 090

AATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATTTA
CGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATTT
CCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGGGT 
TAGACAATTGTCAAATGAGTCAAACCAAAAGTATGGTAAAACTTGTTTTT 
TATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAAGA
AACT 

SEQ ID NO. 8203 

STRAIN A909 

CAGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAACTTTAT
TTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGA 
TTTCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATG
GGTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTT 
TTTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGA
AGAAACT 

SEQ ID NO. 8204 

STRAIN H36B

ATGCAGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAACTT
TATTTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGA
TGATTTCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTT 
ATGGGTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTT
GTTTTTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAA
AGAAGAAACT 

SEQ ID NO. 8205 

STRAIN 18RS21 

AGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAACTTTATT
TACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGAT
TTCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGG
GTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTTT
TTTATTTTAAGGACGGGTTAAAAAGGAGATTTTACTATGATTTTAAAGAA
GAAACT 

SEQ ID NO. 8206 

STRAIN M732 

CAGAATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTAT
TTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGA
TTTCCGTAAGACAGGTTATAATGGTCGAGGTTATCAACCAATGGTTTATG
GGTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTT
TTTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGA
AGAAACT 

SEQ ID NO. 8207 

STRAIN COH1 

GAATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATTT
ACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATT
TCCGTAAGACAGGTTATAATGGTCGAGGTTATCAACCAATGGTTTATGGG 
TTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTTTT
TTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAAG
AAACT 

SEQ ID NO. 8208 

STRAIN M781

AGAATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATT
TACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGAT
TTCCGTAAGACAGGTTATAATGGTCGAGGTTATCAACCAATGGTTTATGG
GTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTGTTT
TTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAA
GAAACT 

SEQ ID NO. 8209 

STRAIN CJB110 

GAATTCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATTT 
ACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATT
TCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGGG 
TTAGACAATTGTCAAATGAGTCAAACCAAAAGTATGGTAAAACTTGTTTT
TTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAAG
AAACT 

SEQ ID NO. 8210 

STRAIN 1169NT 

TCGAAGGCGCTCACTTGGAATATTTAAGACAGAACAAACTTTATTTACGT 
AAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGATGATTTTCG
TAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTATGGGTTAG
ACAATTGTCAAATGAGTCAAACCAAAAGTATGGTAAAACTTGTTTTTTAT
TTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAAGAAGAAAC
T 

SEQ ID NO. 8211 

STRAIN JM9130013 

TGCAGAATTTGAAGGCGCTCATCTGGAATATTTAAGACAGAACAAACTTT
ATTTACGTAAGCAAGATAAGATTGTAACCTTTGGCAAATCTAATAAAGAT
GATTTCCGTAAGACAGGTTATGATGGTCGAGGTTATCAACCAATGGTTTA
TGGGTTAGACAATTGTCAAATGAGTCAGACCAAAAGTATGGTAAAACTTG 
TTTTTTATTTTAAGGACGGGTTAAAAAGGACATTTTACTATGATTTTAAA 
GAAGAAACT 

SEQ ID NO. 8212 

STRAIN 2603 frame: 1

MKNLLLKCKDKKVKAFTLLECLVALVTITGALLVYQGLTKLLAQQIVVMSSSSQSEWVLL
TQQLNAEFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNC 
QMSQTKSMVKLVFYFKDGLKRTFYYDFKEET.

SEQ ID NO. 8213

STRAIN 090 frame: 3 

FEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTKS 
MVKLVFYFKDGLKRTFYYDFKEET 

SEQ ID NO. 8214 

STRAIN A909 frame: 3 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8215 

STRAIN H36B frame: 3

AEFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQT
KSMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8216 

STRAIN 18RS21 frame: 2 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8217 

STRAIN M732 frame: 3 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYNGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8218 

STRAIN COH1 frame: 1 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYNGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8219 

STRAIN M781 frame: 2

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYNGRGYQPMVYGLDNCQMSQTK
SMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8220 

STRAIN CJB110 frame: 1 

EFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTK 
SMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8221 

STRAIN 1169NT frame: 3 

EGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQTKSM 
VKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8222 

STRAIN JM9130013 frame: 2 

AEFEGAHLEYLRQNKLYLRKQDKIVTFGKSNKDDFRKTGYDGRGYQPMVYGLDNCQMSQT
KSMVKLVFYFKDGLKRTFYYDFKEET

SEQ ID NO. 8301
STRAIN 2603 

atgaaaaagattcgattatcaaagtttattaaaatgattgttgttattttgtttttaatt 
agtgtagcagctagtttttattttttccacgttgcccaagttcgagatgataaatccttt 
atttcaaatggtcaacgtaagcctggaaactctttatatgcttatgataaatcctttgat 
aagctattaaagcaaaaaatagaaatgacaaaccaaaatataaagcaagttgcttggtat 
gttcctgctgttaagaaaactcataagacagctgttgtcgttcatggttttgcgaatagc 
aaagagaatatgaaggcatatggttggctgtttcataagttaggatacaatgttcttatg 
cctgacaatattgcacatggtgaaagtcatgggcagttgataggctatggctggaacgac 
cgcgagaacattatcaaatggacagaaatgatagttgataagaatccatcaagccaaatt 
actttatttggtgtttcaatgggtggagcaacagtcatgatggctagtggtgaaaaatta 
cctagtcaggttgttaatatcattgaagattgcggttattctagtgtttgggatgaatta 
aaatttcaggctaaagagatgtatggtttaccagccttcccactcttatatgaagtttca 
acaatttctaaaatcagagcaggtttttcgtatggacaagcaagtagtgtcgaacaattg
aaaaagaataatttaccagccctctttattcatggtgataaggataattttgttccaaca
agtatggtttatgacaactataaagctacagcaggtaagaaagagctttatattgtaaaa 
ggggcaaaacatgcgaaatcttttgaaacagagccagaaaaatatgagaaacgtatctct 
agttttttgaaaaaatatgaaaaa 

SEQ ID NO. 8302 
STRAIN 090 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCG 

AGATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTT 
TATATGCTTATGATAAATCCTTTGATAAGCTATTAAAGCAAAAAATAGAA
ATGACAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAA 
GAAAACTCATAAGACAGCTGTTGTCGTTCATGGTTTTGCGAATAgCAAAG
AGAATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTT
cTTATGCCTGACAATATTGCACATGGtGAAAGTCATGGGCAGTTGATAGG
CTATGGCTGGAACGACCGCGAGAACATTATCaAATGGACAGAAATGATAG
TTGATAAGAATCCATCAAGCCAAATTACTTtaTTTGGTGTTTCAATGGGT 
GGAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGT 
TAATATCATTGAAGATTGCGGTTATTCTAGTGTTTGGGATGAATTAAAAT 
TTCAGGCTAAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGAA 
GTTTCAACAATTTCTAAAATCAGAGCAGGTTTTTCGTATGGACAAGCAAG
TAGTGTCGAACAATTGAAAAAGAATAATTTACCAGCCCTCTTTATTCATG
GTGATAAGGATAATTTTGTTCCAACAAGTATGGTTTATGACAACTATAAA 
GCTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGC 
GAAATCTTTTGAAACAGAGCCAGAAAAATATGAGAAACGTATCTCTAGTT
TTTTGAAAAAATATGAAAAA

SEQ ID NO. 8303 
STRAIN A909 

AATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTTTATATGCT 
TATGATAAATCCTTTGATAAGCTATTAAAGCAAAAAATAGAAATGACAAA
CCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAAGAAAACTC 
ATAAGACAGCTGTTGTCGTTCATGGTTTTGCGAATAGCAAAGAGAATATG 
AAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTTCTTATGCC 
TGACAACATTGCACATGGTGAAAGTCATGGGCAGTTGATAGGCTATGGCT
GGAACGACCGCGAGAACATTATCAAATGGACAGAAATGATAGTTGATAAG
AATTCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGGTGGAGCAAC 
AGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGTTAATATCA
TTGAAGAtTGCGGTTATTCTGGTGTTTGGGATGAATTAAAATTTCAGGCT
AAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGAAGTTTCAAC
AATTTCTAAAATCAGAGCAGGTTTTTCGTATGGACAAgCAAGTAGTGTCG
AACAATTGAAAAAGAATAATTTACCAGCCCTCTTTATTCATGGTGATAAG
GATAATTTTGTTCCAACAaGTATGGTTTATGACAACTATAAAGCTACAGC
AGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGCGAAATCTT
TTGAAaCAGAGCCAGAAAAATATGAGAAACGTATCTCTAGTTTTTTGAAA
AAATATGAAAAA

SEQ ID NO. 8304 

STRAIN H36B 

AGTTTTTATTTTTTCCACGTTGCCCAAGTTCGAGATGATAAATCCTTTAT 
TTCAAATGGTCAACGTAAGCCTGGAAACTCTTTATATGCTTATGATAAAT
CCTTTGATAAGCTATTAAAGCAAAAAATAGAAATGACAAACCAAAATATA
AAGCAAGTTGCTTGGTATGTTCCTGCTGCTAAGAAAACTCATAAGACAGC
TGTTGTCGTTCATGGTTTTGCGAATAGCAAAGAGAATATGAAGGCATATG
GTTGGCTGTTTCATAAGTTAGGATACAATGTTcTTATGCCTGACAACATT 
GCACATGGTGAAAGTCATGGGCAGTTGATAGGCTATGGCTGGAACGACCG
CGAGAACATTATCAAATGGACAGAAATGATAGTTGATAAGAATTCATCAA
GCCAAATTACTTTATTTGGTGTTTCAATGGGTGGAGCAACAGTCATGATG 
GCTAGTGGTGAAAAATTACCTAGTCAGGTTGTTAATATCATTGAAGATTG
CGGTTATTCtGGTGTTTGGGATGAATTAAAATTTCAGGCTAAAGAGATGT
ATGGTTTACCAGCCTTCCCACTCTTATATGAAGTTTCAACAATTTCTAAA 
ATCAGAGCAGGTTTTTCGTATGGACAAgCAAGTAGTGTCGAACAATTGAA
AAAGAATAATTTACCAGCCCTCTTTATTCATGGTGATAAGGATAATTTTG 
TTCCAACAAGTATGGTTTATGACAACTATAAAGCTACAGCAGGTAAGAAA
GAGCTTTATATTGTAAAAGGGGCAAAACATGCGAAATCTTTTGAAACAGA
GCCAGAAAAATATGAGAAACGTATCTCTAGTTTTTTGAAAAAaTATgAAA
AA 

SEQ ID NO. 8305 

STRAIN 18RS21 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCGA 

GATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTTT
ATATGCTTATGATAAATCCTTTGATAAGCTATTAAAGCAAAAAATAGAAA
TGACAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGTTAAG 
AAAACTCATAAGACAGCTGTTGTCGTTCATGGTTTTGCGAATAGCAAAGA 
GAATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTTC
TTATGCCTGACAATATTGCACATGGTGAAAGTCATGGGCAGTTGATAGGC 
TATGGCTGGAACGACCGCGAGAACATTATCAAATGGACAGAAATGATAGT
TGATAAGAATCCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGGTG 
GAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGTT 
AATATCATTGAAGATTGCGGTTATTcTAGTGTTTGGGATgAATTAAAATT 
TCAGGCTAAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGAAG 
TTTCAACAATTTCTAAAATCAGAGCAGGTTTTTCGTATGGACAAgCAAGT
AGTGTCGAACAATTGAAAAAGAATAATTTACCAGCCCTCTTTATTCATGG
TGATAAGGATAATTTTGTTCCAACAAGTATGGTTTATGACAACTATAAAG
CTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGCG
AAATCTTTTGAAaCAGAGCCAGAAAAATATGAGAAACGTATCTCTAGTTT
TTTGAAAAAATATGAAAAA

SEQ ID NO. 8306 

STRAIN M732 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCGA 

GATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTTT
ATATGCTTATGATAAATCCTTTGATAAGCTATTAAAGCAAAAAATAGAAA
TGACAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAAG
AAAACTCATAAGACAGTTGTTGTCGTTCATGGTTTTGCGAATAGCAAAGA
GAATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTTC 
TTATGCCTGACAACATTGCACATGGTGAAAGTCATGGGCAGTTGATAGGC 
TATGGCTGGAACGACCGCGAGAACATTATCAAATGGACAGAAATGATAGT
GGATAAGAATCCATCAAGCCAAATTaCTTTATTTGGTGTTTCAATGGGTG 
GAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGTT 
AATATCATTGAAGATTGTGGTTATTCTAGTGTTTGGGATGAATTAAAATT 
TCAGGCTAAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGAAG
TTTCAACAATTTCTAAAATCAGAGCAGGTTTTTCGTATGGACAAgCAAGT
AGTGTCGAACAATTGAAAAAGAATAATTTACCAGCCCTcTTTATTCATGG
TGATAAGGATAATTTTGTTCCAACAAGTATGGTTTATGACAACTATAAAG
CTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGCG 
AAATCTTTTGAAACAGAGCCAGAAAAATATGAGAAACGTATCTCTAGTTT
TTTGAAAAAATATGAAAAA

SEQ ID NO. 8307 

STRAIN COH1 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTC 

GAGATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCT
TTATATGCTTATGATAAATCCTTTGATAAGCTATTAAAGCAAAAAATAGA
AATGaCAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTA
AGAAAACTCATAAGACAGTTGTTGTCGTTCATGGTTTTGCGAATAGCAAA
GAGAATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGT
TCTTATGCCTGACAACATTGCACATGGTGAAAGTCATGGGCAGTTGATAG
GCTATGGCTGGAACGACCGCGAGAACATTATCAAATGGACAGAAATGATA
GTGGATAAGAATCCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGG 
TGGAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTG
TTAATATCATTGAAGATTGTGGTTATTcTAGTGTTTGGGATgAATTAAAA 
TTTCAGGCTAAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGA 
AGTTTCAACAATTTCTAAAATCAGAGCAGGTTTTTCGTATGGACAAGCAA
GTAGTGTCGAACAATTGAAAAAGAATAATTTACCAGCCCTcTTTATTCAT 
GGTGATAAGGATAATTTTGTTCCAACAaGTATGGTTTATGACAACTATAA
AGCTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATG
CGAAATCTTTTGAAaCAGAGCCAGAAAAATATGAGAAACGTATCTCTAGT
TTTTTGAAAAAATATGAAAAA

SEQ ID NO. 8308 

STRAIN M781 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCG 

AGATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTT 
TATATGCTTATGATAAATCCTTTGATAAGCTATTAAAGCAAAAAATAGAA
ATGACAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAA 
GAAAACTCATAAGACAGTTGTTGTCGTTCATGGTTTTGCGAATAGCAAAG 
AGAATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTT
CTTATGCCTGACAACATTGCACATGGTGAAAGTCATGGGCAGTTGATAGG 
CTATGGCTGGAACGACCGCGAGAACATTATCAAATGGACAGAAATGATAG 
TGGATAAGAATCCATCAAGCCAAATTaCTTTATTTGGTGTTTCAATGGGT 
GGAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGT
TAATATCATTGAAGATTGTGGTTATTcTAGTGTTTGGGATgAATTAAAAT
TTCAGGcTAAAGAGATGTATGGTTTACCAGCCTTCCCACTcTTATATGaA 
GTTTCAacAATTTcTAAAATcAgAGCAGGTTTTTCGTATGGACaAgCAAG 
TAgTGTCGAACAATtGAAAAAGAATAATTTACCAGCCCTcTTTATTCATG
GTGATAAGGATAATTTTGTTCCAACAaGTATGGTTTATGaCAaCTATAAA
GCTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGC 
GAAATCTTTTGAAaCAGAGCCAGAaaAATATGAGAAACGTATCTCTAGTT
TTTTGAAAAAATATGAAAAA

SEQ ID NO. 8309 

STRAIN CJB110 

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCGAG 
ATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTTTA 
TATGCTTATGATAAATCCTTTGATAAGCTATTAAAGCAAAAAATAGAAAT
GACAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAAGA 
AAACTCATAAGACAGCTGTTGTCGTTCATGGTTTTGCGAATAGCAAAGAG 
AATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTTcT
TATGCCTGACAATATTGCACATGGTGAAAGTCATGGGCAGTTGATAGGCT
ATGGCTGGAACGACCGCGAGAACATTATCAAATGGACAGAAATGATAGTT
GATAAGAATCCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGGTGG 
AGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGTTA
ATATCATTGAAGATTGCGGTTATTcTAGTGTTTGGGATgAATTAAAATTT
CAGGCTAAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGAAGT
TTCAACAATTTCTAAAATCAGAGCAGGTTTTTCGTATGGACAAgCAAGTA
gTGTCGAACAATTGAAAAAGAATAATTTACCAGCCCTcTTTATTCATGGT
GATAAGGATAATTTTGTTCCAACAAGTATGGTTTATGACAACTATAAAGC
TACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGCGA 
AATCTTTTGAAACAGAGCCAGAAAAATATGAGAAACGTATCTCTAGTTTT
TTGAAAAAATATGAAAAA

SEQ ID NO. 8310 

STRAIN 1169NT
GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCGA 

GATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTTT
ATATGCTTATGATAAATCCTTTGATAAGCTATTAAAGCAAAAAATAGAAA
TGACAAACCaAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGCTAAG 
AAAACTCATAAGACAGCTGTTGTCGTTCATGGTTTTGCGAAtAGCAAAGA 
gAATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTTc
TTATACCTGACAATATTGCACATGGTGAAAGTCATGGGCAGTTGATAGGC
TATGGCTGGAACGACCGCGAGAACATTATCAAATGGACAGAAATGATAGT
TGATAAGAATCCATCAAGCCAAATTACTTTATTTGGTGTTTCAATGGGTG 
GAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGTT
AATATCATTGAAGATTgCGGTTATTcTAGTGTTTGGGATgAATTAAAATT 
TCAGGCTAaAGAGATGTATGGTTTaCCAGCCTTCCCACTcTTATATGAAG 
TTTCAACAATTTCTAAAATCAGAGCAGGTTTTTCGTATGGACAAGCAAGT 
AGTGTAGAACAATTGAAAAAGAATAATTTACCAGCCCTCTTTATTCATGG
TGATAAGGATAATTTTGTTCCAACAAGTATGGTTTATGACAACTATAAAG
CTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGCG
AAATCTTTTGAAaCAGAGCCAGAAAAATATGAGAAACGTATCTCTAGTTT
TTTGAAAAAATATGAAAAA

SEQ ID NO. 8311

STRAIN JM9130013

GCTAGTTTTTATTTTTTCCACGTTGCCCAAGTTCG 

AGATGATAAATCCTTTATTTCAAATGGTCAACGTAAGCCTGGAAACTCTT 
TATATGCTTATGATAAATCCTTTGATAAGCTATTAAAGCAAAAAATAGAA
ATGaCAAACCAAAATATAAAGCAAGTTGCTTGGTATGTTCCTGCTGTTAA 
GAAAACTCATAAGACAGCTGTTGTCGTTCATGGTTTTGCGAATAGCAAAG
AGAATATGAAGGCATATGGTTGGCTGTTTCATAAGTTAGGATACAATGTT 
CTTATGCCTGACAATATTGCACATGGTGAAAGTCATGGGCAGTTGATAGG 
CTATGGCTGGAACGACCGCGAGAACATTATCaAATGGACAGAAATGATAG
TTGATAAGAATCCATCAAGCCAAATTaCTTTATTTGGTGTTTCAATGGGT 
GGAGCAACAGTCATGATGGCTAGTGGTGAAAAATTACCTAGTCAGGTTGT 
TAATATCATTGAAGATTGCGGTTATTcTAGTGTTTGGGATgAATTAAAAT
TTCAGGCTAAAGAGATGTATGGTTTACCAGCCTTCCCACTCTTATATGAA
GTTTCAACAATTTCTAAAATCAGAGCAGGTTTTTCGTATGGACAAGCAAG
TAGTGTCGAACAATTGAAAAAGAATAATTTACCAGCCCTCTTTATTCATG
GTGATAAGGATAATTTTGTTCCAACAAGTATGGTTTATGACAACTATAAA
GCTACAGCAGGTAAGAAAGAGCTTTATATTGTAAAAGGGGCAAAACATGC 
GAAATCTTTTGAAACAGAGCCAGAAAAATATGAGAAACGTATCTCTAGTT 
TTTTGAAAAAATATGAAAAA

SEQ ID NO. 8312 
STRAIN 2603 frame: 1

MKKIRLSKFIKMIVVILFLISVAASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFD
KLLKQKIEMTNQNIKQVAWYVPAVKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLM
PDNIAHGESHGQLIGYGWNDRENIIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKL 
PSQVVNIIEDCGYSSVWDELKFQAKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQL 
KKNNLPALFIHGDKDNFVPTSMVYDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRIS
SFLKKYEK 

SEQ ID NO. 8313 

STRAIN 090 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA 
AKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ 
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8314 

STRAIN A909 frame: 3 

SFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPAAKKTHKTAVVVHGFA
NSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDRENIIKWTEMIVDKNSSS 
QITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSGVWDELKFQAKEMYGLPAFPLLYE
VSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMVYDNYKATAGKKELYI
VKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8315 

STRAIN H36B frame: 1

SFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPAA
KKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDRENI
IKWTEMIVDKNSSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSGVWDELKFQA 
KEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMVY
DNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8316 

STRAIN 18RS21 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
VKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8317

STRAIN M732 frame: 1

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTVVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ 
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV 

YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8318 

STRAIN COH1 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTVVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV 
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8319 

STRAIN M781 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTVVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV 
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8320 

STRAIN CJB110 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ 
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV 

YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8321 

STRAIN 1169NT frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
AKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLIPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV 
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8322 

STRAIN JM9130013 frame: 1 

ASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFDKLLKQKIEMTNQNIKQVAWYVPA
VKKTHKTAVVVHGFANSKENMKAYGWLFHKLGYNVLMPDNIAHGESHGQLIGYGWNDREN
IIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKLPSQVVNIIEDCGYSSVWDELKFQ 
AKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQLKKNNLPALFIHGDKDNFVPTSMV 
YDNYKATAGKKELYIVKGAKHAKSFETEPEKYEKRISSFLKKYEK

SEQ ID NO. 8401 
STRAIN 2603 

ATGATGAAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTATCAGTGGCTGTACTAAAC 
AATATGGAATGTTTAGCGACTGTCACTATCAATATCAAAAAGAATCATAGCATTAATTTG 
ATGCCAGCCATTGATTTTTTAATGCAATCAATTGATTTAGAACCTCAAGATTTGGACCGT
ATCGTAGTAGCAGAGGGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCA
AAAATGCTAGCTTATACGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACGCTTTA 
ACAAATGGATTTTCAGAAAATGATTTATTGGTACCACTTATAGATGCACGACGTAATAAT
GTTTATGTTGGTTTCTATCAAAATGGTGATACTGTTAAACCAGACTGTCACACTTCTCTT 
GAAGAAGTCTTACAAGAGGTGGGGAATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCA
GCATtTTTTGATCAGATTAAGAAAGCCTTACCACATGCTAAAATTACAGAAACTTTACCT
TGTGCAGTAGCAATTGGGCGCAAAGGACAAAAAATGAAAAGCGTTAATGTAGATGCGTTT 
GTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGGTTAAAAAACCACTGTGAA 
ACGAATACAGAAGAATATATTAAGAGAGTT

SEQ ID NO. 8402 

STRAIN 090

AAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTATCAGTGGCTGTACT
AAACAATATGGAATGTTTAGCGACTGTCACTaTCAATATCAAAAAGAATC
ATAGCATTAATTTGATGCCAGCCATTGATTTTTTAATGCAATCAATTGAT
TTAGAACCTCAAGATTTGGACCGTATCGTAGTGGCAGAGGGTCCAGGATC
TTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAATGCTAGCTTATA
CGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACGCTTTAACAAAT 
GGATTTTCAGAAAATGATTTGTTGGTACCACTTATAGATGCACGACGTAA
CAATGTTTATGTTGGTTTCTATCAAAATGGTGATACTGTTAAACCAgACT
GTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGAATAAAGCCAAT 
GTTCATTTTGTCGGAGAGGTTGCAGCATTTTTTGATCAGATTAAgAAAGC
CTTACCACATGCTAAAATTACAGAAACTTTACCTTGTGCAGTGGCAATTG
GGCGCAAAGGACAAAAAATGGAAAGCGTTAATGTAGATGCGTTTGTTCCA 
CGATACTTAAAACGAGTTGAAGCTGAGGAAAATTGGTTAAAAAACCACTG
TGAAACGAAT 

SEQ ID NO. 8403 

STRAIN A909 

AAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTATCAG
TGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATATC
AAAAAGAATCATAGCATTAATTTGATGCCAGCCATTGATTTTTTAATGCA
ATCAATTGATTTAGAACCTCAAGATTTGGACCGTATCGTAGTAGCAGAGG
GTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAATG
CTAGCTTATACGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACGC
TTTAACAAATGGATTTTCAGAAAATGATTTATTGGTACCACTTATAGATG
CACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGAGATACTGTT
AAACCAGACTGTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGAA
TAAAGCCAATGTTCATTTTGTCGGAgAGGTTGCAgCATTTGTTGACCAGA
tTAAgAAAGTTTTACCACATGCTAAAATTACAGAAACTTTACCTTGTGCA
GtGGCAATTGGGCGCAAAGGACAAAAAATGAAAAGCGTTAATGTAGATGC
GTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGGTTAA
GAAACCACTGTGAAACGAAT

SEQ ID NO. 8404 

STRAIN H36B

AAAGT T T T AGC CT T T GAT ACT T C AAGC AAAG C AC TAT C A 
GTGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATAT 
CAAAAAGAAT CAT AG C AT T AAT T T GAT G C C AG C CAT T GAT T T T T T AAT G C 
AAT C AAT T GAT T TAG AAC CT C AAG ATT T GG ACC GT AT C G TAG TAG C AG AG 
GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 
G CT AG C T T AT AC GC T T AAGAT T GAC T T AGT T GG AGT AT CT AG C C T GT AC G 
CT T T AAC AAAT GGAT T T T C AG AAAAT GAT T TAT T G GT AC C ACT TAT AG AT 
GC AC GAC GT AAC AAT GT T T AT GT T G GT T T C TAT C AAAAT G GAG AT ACT GT 
T AAAC C AG AC T G T C AC ACT T CT C T T GAAG AAGT C T T AC AAG AG GT G GG G A 
ATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCAGCATTTGTTGACCAG 
AT T AAG AAAGT T T T AC C AC AT G C T AAAAT TAC AG AAAC T T TAC C T T GT G C 
AGT GG C AAT T G G G C G C AAAG G AC AAAAAAT G AAAAG C G T T AAT GT AG AT G 
CGTTTGTTC C AC GAT AC T T AAAAC GT GT T G AAGC T GAG G AAAAT T G GT T A 
AGAAAC C AC T GT G AAAC G AAT AC AG AAGAAT AT AT T AAG AG AG T T 

SEQ ID NO. 8405 

STRAIN 18RS21 

AAAGT TT TAGC CT T TGAT ACT T CAAGCAAAGCACT AT C A 

GT GG C T GT AC T AAAC AAT AT G GAAT GT T T AGC GACT GT C AC TAT C AAT AT 

CAAAAAGAAT CAT AGCATTAATT T GAT GC CAGCCAT TGAT T T T TTAATGC 

AAT C AAT T GAT T TAG AAC C T C AAG AT T T GG AC CG T AT C G T AGT AG C AGAG 

GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 

GCTAGCTTATACGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACG 

C T T T AAC AAAT GGAT T T T C AG AAAAT GAT T T AT T GGT AC C AC T TAT AG AT 

GCACGACGTAATAATGTTTATGTTGGTTTCTATCAAAATGGTGATACTGT 

T AAAC C AG ACT G T C AC AC T T CT C T T GAAG AAGT CT T AC AAGAGG T GG GG A 

ATAAAGCCAATGTTCATTTTGTCGGAGAGGtTGCAGCATTTTTTGATCAg 

AT T AAg AAAG C CT T AC C AC AT G C T AAAAT TAC AG AAAC T T T AC C T T G T G C 

AGT AG C AAT T G GG C G CAAAGG AC AAAAAAT GAAAAG C G T T AAT GT AG AT G 

CGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGGTTA
AAAAACCACTGTGAAACGAATACAGAAGAATATATTAAGAGAGTT

SEQ ID NO. 8406 

STRAIN M732 

AAAGT T T TAG C CT TT GAT AC T T C AAG C AAAGC ACT AT C A 
GTGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATAT 
C AAAAAG AAT CAT AG CAT T AAT T T GAT G C C AG C CAT T GAT T T T T T AAT G C 
AAT CAAT T G ATT T AG AAC CT C AAG AT T T GG AC C GT AT C GT AGT AG C AG AG 
GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 
G C T AG CT T AT ACG CT T AAG AT T G AC T T AGT T G G AGT AT C TAG C C T GT AC G 
C TT T AAC AAAT GG AT T T T C AGAAAAT GAT T T AT T G GT AC C AC T TAT AG AT 
GCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGTGATACTGT 
TAAACCAGACTGTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGA 
ATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCAGCATTTTTTGATCAG 
AT T AAGAAAG C CT T AC C AC AT G C T AAAAT T AC AG AAAC T T T AC CT T G T G C 
AG TAG CAAT T GGG CG C AAAG G AC AAAAAAT G AAAAG C G T T AAT GT AG Ann 
CGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGGTTA 
AAAAAC C AC T GT G AAAC GAAT AC AG AAGAAT AT AT T AAG AG AGT T 

SEQ ID NO. 8407 

STRAIN COH1 

AAAGT T T TAGC CT T TGAT ACT T CAAGCAAAGC AC 

TATCAGTGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATC 
AAT AT C AAAAAG AAT CAT AG CAT T AAT T T GAT G C C AG C CAT T GAT T T T T T 
AAT G CAAT CAAT T GAT T TAG AAC C T C AAG AT T T GG AC C G T AT C G T AGT AG 
CAGAGGGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCA 
AAAAT G C TAG C T T AT AC GC T T AAG AT T G AC T T AGT T G G AGT AT C T AG C C T 
GT AC G CT T T AAC AAAT G GAT T T T C AG AAAAT GAT T TAT T GGT AC C AC T T A 
TAGATGCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGTGAT 
AC T GT T AAAC C AG AC T GT C AC ACT T C T C T T G AAGAAGT C T T AC AAG AGGT 
GGGGAATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCAGCATTTTTTG 
AT C AG AT T AAG AAAG C CT T AC C AC AT GC T AAAAT T AC AG AAAC T T T AC C T 
T GT G C AGT AGC AAT T GGG C GC AAAG G AC AAAAAAT G AAAAG CGT T AAT GT 
AG AT G C GTT T GT T C C ACG AT AC T T AAAAC GT GT T G AAG CT GAGG AAAAT T 
GGT T AAAAAAC C AC T GT G AAAC G AAT AC AGAAGAAT AT AT T AAGAG AGT T 

SEQ ID NO. 8408 

STRAIN M781 

AAAGT T T TAGC CT T T GAT AC T T C AAG C AAAGC AC T A 

T C AGT GGC T G TACT AAAC AAT AT G G AAT GT T T AG CGACT G T C AC TAT C AA 
TAT C AAAAAG AAT CAT AG C AT T AAT T T GAT G C C AG C CAT T GAT T T T T T AA 
T GCAAT CAAT T GAT T T AGAAC CT C AAG AT T T GGAC C GT AT C GT AGT AT C A 
G AGGGT C C AG GAT C T TAT AC GGG C T T AC GT GT AG C T GT T G C T AC AG C AAA 
AAT G CT AG CT TAT AC G C T T AAG AT T G AC T T AGT T G G AGT AT C T AG C C T GT 
ACGC T T T AAC AAAT G GAT T T T C AG AAAAT GAT T T AT T G G T AC C AC T TATA 
GATGCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGTGATAC 
T GT T AAAC C AG AC T GT C AC AC T T CT C T T G AAG AAGT CT T AC AAG AGGT G G 
GGAATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCAGCATTTTTTGAT 
C AG AT T AAG AAAG C C T T AC C AC AT G C T AAAAT T AC AGAAACT TTACCTTG 
T G C AG TAG CAAT T GGG C G CAAAGG AC AAAAAAT G AAAAG C GT T AAT GT AG 
ATGCGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAGGAAAATTGG 
T T AAAAAAC C AC T G T G AAAC GAAT AC AG AAG AAT AT AT T AAG AG AG T T 

SEQ ID NO. 8409 

STRAIN CJB110 

AAAGT T T T AG C C T T T GAT AC T T C AAG C AAAG C AC TAT C A 
GTGGCTGtaCTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATAT 
C AAAAAGAAT C AT AGCAT T AATT T GATGCCAGCC AT TGAT T T TT T AAT GC 
AAT C AAT T GAT TT AGAAC CTC AAG ATTTGGACCGT AT CGT AGTGGCAGAG 
GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 
GCT AG C T T AT AC GC T T AAG AT T G AC T T AGT T GG AGT AT C T AG CCTGTACG 
CTTTAACAAATGGATTTTCAGAAAATGATTTGTTGGTACCACTTATAGAT 
G C AC G AC GT AAC AAT GT T TAT GTTGGTTT CT AT C AAAAT GGT GAT AC T G T 
TAAACCAGACTGTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGA
ATAAAGCCAATGTTCATTTTGTCGGAGAGGTTGCAGCATTTTTtgATCAG
ATTAAGAAAGCCTTACCACATGCTAAAATTACAGAAACTTTACCTTGTGC 
AGT GGCAATT GGGCGCAAAGGACAAAAAATGGAAAGCGTT AATGTAgAT G 
C GT T T GT T C C AC GAT AC T T AAAACG AGT T G AAG C T GAG GAAAAT T GGT T A 
AAAAACCACTGTGAAACGAATACAGAAGAATATAT TAAGAGAGT T 

SEQ ID NO. 8410 

STRAIN 1169NT 

AAAGTTTTAGCCTTTGATACTTCAAGCAAAGCACTATCA 
GTGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATAT 
C AAAAAG AAT C AT AGC AT T AAT T T GAT G CC AGC C a T T GAT T T T T T AAT GC 
AAT CAATTGATT T AGAACCT CAAGAT TTGGACCGT AT CGT AGT AGCAGAG 
GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 
GCTAGCTTATACGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACG 
C T T T AAC AAAT GG AT T T T C AGAAAAT GAT T TAT T GGT AC C ACT T AT AGAT 
GCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGTGATACTGT 
TAAACCAGACTGTCACACTTCTCTTGAAGAAGTCTTACAAGAGGTGGGGA 
AT AAAGC C AAT GT TC AT T T T GT CGGAg AGGT T G C AGC AT T T GT T G AC C AG 
AT T AAG AAAGC T T T ACC AC At G CT AAAATT AC AG AAAC T T T AC C T T GT G C 
AGTGGCAATTGGGCGCAAAGGACAAAAAATGGAAAGCGTTAATGTAgATG 
CGTTTGTTCCACGATACTTAAAACGTGTTGAAGCTGAgGAAAATTGGTTA 
AAAAAC C AC T GT GAAACG AAT AC AGAAG AAT AT AT TAAGAGAGT T 

SEQ ID NO. 8411 

STRAIN JM9130013 

AAAGT T T T AG C C T T T GAT ACT T C AAG C AAAG C ACT AT C A 
GTGGCTGTACTAAACAATATGGAATGTTTAGCGACTGTCACTATCAATAT 
C AAAAAG AAT CAT AG CAT T AAT T T GAT G C C AG C C AT T GAT T T T T T AAT GC 
AAT C AAT T GAT T T AGAAC C T CAAGAT T T GG AC C GT AT C GT AGT AGC AG AG 
GGTCCAGGATCTTATACGGGCTTACGTGTAGCTGTTGCTACAGCAAAAAT 
gCTAGCTTATACGCTTAAGATTGACTTAGTTGGAGTATCTAGCCTGTACG 
C T T T AAC AAAT G GAT T T T C AGAAAAT GAT T TAT T G GT AC C ACT TAT AG AT 
GCACGACGTAACAATGTTTATGTTGGTTTCTATCAAAATGGAGATACTGT 
T AAAC C AG AC T GT C AC AC T T C T CT T GAAG AAG T C T T AC AAG AG GT GG G G A 
AT AAAG C C AAT G T T CAT T T T GT CGG AGAG G T T GC AG C AT T T GT T G AC C AG 
AT T AAG AAAGT T T T AC C AC AT G CT AAAAT T AC AG AAAC T T T AC C T T GT GC 
AGTGGCAATTGGGCGCAAAGGACAAAAAATGAAAAGCGTTAATGTAGATG 
CGT T TGTT C CACGATACTTAAAACGT GTTGAAGCTG AGG AAAATT GGT T A 
AG AAAC C ACT GTGAAAC G AAT AC AGAAG AAT AT AT T AAG AG AG T T 

SEQ ID NO. 8412 
STRAIN 2603 frame: 1

MMKVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDR 
IVVAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNN
VYVGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLP 
CAVAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8413 

STRAIN 090 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV 
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY 
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMESVNVDAFVPRYLKRVEAEENWLKNHCETN

SEQ ID NO. 8414 

STRAIN A909 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY 
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFVDQIKKVLPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLRNHCETN 

SEQ ID NO. 8415 

STRAIN H36B frame: 1

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFVDQIKKVLPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLRNHCETNTEEYIKRV 

SEQ ID NO. 8416 

STRAIN 18RS21 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV 
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8417 

STRAIN M732 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV 
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA
VAIGRKGQKMKSVNVXXFVPRYLKRVEAEENWLKNHCETNTEEYIKRV

SEQ ID NO. 8418 

STRAIN COH1 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8419 

STRAIN M781 frame: 1

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV 
VSEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8420 

STRAIN CJB110 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFFDQIKKALPHAKITETLPCA 
VAIGRKGQKMESVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8421 

STRAIN 1169NT frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFVDQIKKALPHAKITETLPCA 
VAIGRKGQKMESVNVDAFVPRYLKRVEAEENWLKNHCETNTEEYIKRV 

SEQ ID NO. 8422 

STRAIN JM9130013 frame: 1 

KVLAFDTSSKALSVAVLNNMECLATVTINIKKNHSINLMPAIDFLMQSIDLEPQDLDRIV
VAEGPGSYTGLRVAVATAKMLAYTLKIDLVGVSSLYALTNGFSENDLLVPLIDARRNNVY
VGFYQNGDTVKPDCHTSLEEVLQEVGNKANVHFVGEVAAFVDQIKKVLPHAKITETLPCA
VAIGRKGQKMKSVNVDAFVPRYLKRVEAEENWLRNHCETNTEEYIKRV 

SEQ ID NO. 8501 
STRAIN 2603 

atgagtaaacgacaaaatttaggaattagtaaaaaaggagcaattatatcagggctctca 
gtggcactaattgtagtaataggtggctttttatgggtacaatctcaacctaataagagt 
gcagtaaaaactaactacaaagtttttaatgttagagaaggaagtgtttcgtcctcaact 
cttttgacaggaaaagctaaggctaatcaagaacagtatgtgtattttgatgctaataaa 
ggtaatcgagcaactgtcacagttaaagtgggtgataaaatcacagctggtcagcagtta 
gttcaatatgatacaacaactgcacaagcagcctacgacactgctaatcgtcaattaaat 
aaagtagcgcgtcagattaataatctaaagacaacaggaagtcttccagctatggaatca 
agtgatcaatcttcttcatcatcacaaggacaagggactcaatcgactagtggtgcgacg 
aatcgtctacagcaaaattatcaaagtcaagctaatgcttcatacaaccaacaacttcaa
gatttgaatgatgcttatgcagatgcacaggcagaagtaaataaagcacaaaaagcattg
aatgatactgttattacaagtgacgtatcagggacagttgttgaagttaatagtgatatt 
gatccagcttcaaaaactagtcaagtacttgtccatgtagcaactgaaggtaaactccaa 
gtacaaggaacgatgagtgagtatgatttggctaatgttaaaaaagaccaggctgttaaa 
ataaaatctaaggtctatcctgacaaggaatgggaaggtaaaatttcatatatctcaaat 
tatccagaagcagaagcaaacaacaatgactctaataacggctctagtgctgtaaattat 
aaatataaagtagatattactagccctctcgatgcattaaaacaaggttttaccgtatca 
gttgaagtagttaatggagataagcaccttattgtccctacaagttctgtgataaacaaa 
gataataaacactttgtttgggtatacaatgattctaatcgtaaaatttccaaagttgaa 
gtcaaaattggtaaagctgatgctaagacacaagaaattttatcaggtttgaaagcagga 
caaatcgtggttactaatccaagtaaaaccttcaaggatgggcaaaaaattgataatatt 
gaatcaatcgatcttaactctaataagaaatcagaggtgaaa 

SEQ ID NO. 8502 

STRAIN 090 

T T T T T AT GG GT AC AAT C T C AAC C T AAT AAG AGT G C AGT AAAAAC T AAC T A 
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
C AG G AAAAG C T AAGG CT AAT C AAGAAC AGT AT G T GT AT T T T GAT G CT AAT 
AAAGGTAATCGAGCAACTGTCACAGTTAAAGTGGGTGATAAAATCACAGC 
T G GT C AG C AGT TAG T T C AAT AT GAT AC AAC AAC T G C AC AAGC AG CC T AC G 
AC AC T G C T AAT C GT C AAT T AAAT AAAGT AG C GC GT C AGAT T AAT AAT CT A 
AAG AC AAC AGGAAGT CT T C C AG C T AT G G AAT T AAGTG AT C AAT C TT C T T C 
AT CAT C AC AAGGAC AAGGG AC T C AAT CG AC T AG T GGT G C G AC GAAT C GT C 
TACAGCAAAATTATCAAAGTCAAGCTAATGCTTCATACAACCAACAACTT 
C AAGAT T T GAAT GAT GC T T AT G C AG AT G C AC AGGC AG AAGT AAAT AAAGC 
AC AAAAAGC AT T GAAT GAT ACT GT TAT T AC AAGT G AC GT AT C AGG G AC AG 
T T GT T G AAGT T AAT AGT GAT AT T GAT C C AG C T T C AAAAAC T AGT C AAGT A 
C T T GT C C AT GT AG C AAC T G AAG GT AAAC T C C AAG T AC AAGG AAC GAT GAG 
TGAGTATGATTTGGCTAATGTTAAAAAAGACCAGGCTGTTAAAATAAAAT 
C T AAG GT C TAT C C T GAC AAGG AAT GGGAAG G T AAAAT T T CAT AT AT C T C A 
AATTATCCAGAAGCAGAAGCAAACAACAATGACTCTAATAACGGCTCTAG 
TGCTGTAAATT AT AAAT AT AAAGT AGAT ATTACTAGCCCTCTCGATGC AT 
T AAAAC AAG G T T T T AC C GT AT C AGT T G AAGT AGT T AAT GGAGAT AAGC AC 
CTTATTGTCCCTACAAGTTCTGTGATAAACAAAGATAATAAACACTTTGT 
T T G GGT AT AC AAT GAT T CT AAT CG T AAAAT T T C C AAAGT T G AAGT C AAAA 
T T GG T AAAGC T G AT GC T AAG AC AC AAGAAAT T T TAT C AG GT T T GAAAG C A 
GGAC AAAT CGTGGTT ACT AAT CC AAGT AAAACCTTCAAGGATGGGC AAAA 
AAT T GAT AAT AT T GAAT C AAT C GAT C T T AAC T C T AAT AAG AAAT C AG AG G 

SEQ ID NO. 8503 

STRAIN A909 

T T T T T AT G GGT AC AAT C T C AAC CT AAT AAGAGT G C AGT AAAAACT AA 
CTACAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTT 
T GAC AG GAAAAG C T AAGGCT AAT C AAGAAC AGT AT G T G T AT T TT GAT G CT 
AAT AAAGGT AAT CGAGCAACTGTT AC AGTTAAAGTGGGTGAT AAAAT CAC 
AG C T G G T C AG C AGT TAG T T C AAT AT GAT AC AAC AAC T G C AC AAG C AG C C T 
AC GAC AC T G CT AAT CG T C AAT T AAAT AAAGT AGC G C GT C AG AT T AAT AAT 
C T AAAG AC AAC AGG AAGT C T T C C AGC T AT G GAAT C AAG T G AT C AAT C T T C 
AT CAT CAT CAC AAGGAC AAG G G G C T C AAT C G ACT AG T G GT G C G ACG AAT C 
GT C T AC AG C AAAAT TAT C AAAGT CAAGCT AAT G C T T CAT AC AAC C AAC AA 
CTTCAAGATTTGAATGATGCTTATGCAGATGCACAGGCAGAAGTAAATAA 
AGCACAAAAAGCATTGAATGATACTGTTATTACAAGTGACGTATCAGGGA 
C AGT T GTT GAAGTT AAT AGTGAT ATT GAT CCAGCTTC AAAAACT AGT CAA 
GT AC T T GT C CAT GT AG C AACT G AGG G T AAAC T C C AAGT AC AAG G AAC GAT 
GAGTGAGTATGATTTGGCTAATGTTAAAAAAGACCAGTCTGTTAAAATAA 
AAT C T AAGGT C T AT C C T GAC AAG GAAT GGG AAGG T AAAAT T T CAT AT AT C 
TCAAATTATCCAGAAGCAGAAGCAAACAACAATGACTCTAATAACGGCTC 
TAGTGCTGTAAATTATAAATATAAAGTAGATATTACTAGCCCTCTCGATG 
C AT T AAAACAAGGTTTT ACT GT AT CAGTTGAAGT AGT T AAT GGAGAT AAG 
CACCTTATTGTTCCTACAAGTTCTGTGACAAACAAAGATAATAAACACTT 
TGTTTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAGTCA 
AAAT T GGT AAAG CT G AT G C T AAG AC AC AAG AAAT T T T AT C AG GT T T G AAA 
G C AG GAC AAAT C GT GGT T AC T AAT C C AAG C AAAAC T T T C AAG GAT GGG C A 
AAAAATTGATAATATTGAATCAATAGATCTTAAGTCTAATAAGAAATCAG
AGGTGAAA

SEQ ID NO. 8504 

STRAIN H36B

T T T T TAT G G GT AC AAT C T C AAC C T AAT AAG AGT G C AGT AAAAAC T AAT T A 
C AAAGT T T T T AAT GT T AGAG AAG G AAGT GTTTCGTCCT C AAC T C T T T T GA 
CAGGAAAAGCTAAGGCTAATCAAGAACAGTATGTGTATTTTGATGCTAAT 
AAGGGTAATCGAGCAACTGTTACAGTTAAAGTGGGTGATAAAATCACAGC 
T G G T C AG C AGT T AGT T C AAT AT G AT AC AAC AACT G C AC AAGC AG C C T AC G 
ACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATAATCTA 
AAGAC AAC AGG AAG T C T T C C AGC T AT GGAAT C AAGT GAT C AAT CT T CAT C 
ATCATCACAAGGACAAGGGACTCAATCGACTAGTGGTGCGACGAATCGTC 
TAG AG C AAAAT TAT C AAAGT C AAG CT AAT GC T T CAT AC AAC C AAC AACT T 
C AAG AT T T GAAT GAT G CT TAT GC AG AT G C AC AGG C AGAAG T AAAT AAAG C 
AC AAAAAGC AT T G AAT GAT AC T GT T AT T AC AAGT G AC GT AT C AGGGAC AG 
T T GT T G AAGT T AAT AGT GAT AT T GAT C C AGC T T C AAAAAC T AGT C AAGT A 
CTTGTCCATGTAGCAACTGAAGGTAAACTCCAAGTACAAGGAACGATGAG 
TGAGTATGATTTGGCTAATGTAAAAAAAGACCAGGCTGTTAAAATAAAAT 
CTAAGGTCTATCCTGACAAGGAATGGGAAGGTAAAATTTCATATATCTCA 
AAT TAT C C AGAAG C AGAAG C AAAC AAC AAT G ACT C T AAT AAC GG CT C TAG 
TGCTGTAAATTATAAATATAAAGTAGATATTACTAGCCCTCTCGATGCAT 
T AAAAC AAGG T T T T AC T G T AT C AGT T G AAGT AGT T AAT GG AG AT AAGC AC 
C T T AT T GT T C CT AC AAGT T C T GT G AC AAAC AAAG AT AAT AAAC ACT T T GT 
T T GGGT AT AC AAT GAT T C T AAT C GT AAAAT T T C C AAAGT T G AAGT C AAAA 
T T G GT AAAG C T GAT G C T AAGAC AC AAG AAAT T T T AT C AGGT T T G AAAG C A 
G GAC AAAT CG T AGT TACT AAT C C AAG T AAAG C T T T C AAG GAT GG G C AAAA 
AAT T GAT AAT AT T GAAT C AAT C GAT C T T AAGT C T AAT AAG AAAT C AG AG G 
TG 

SEQ ID NO. 8505 

STRAIN 18RS21 

T T T T T AT G G GT AC AAT CT C AAC C T AAT AAG AGT G C AGT AAAAAC T AACT A 
C AAAGT TT TT AAT GTT AGAGAAGG AAGT GTTTCGTCCTC AACT CTTTTG A 
C AG GAAAAG CT AAGG C T AAT C AAG AAC AGT AT GT GT ATT T T G ATG C T AAT 
AAAG GT AAT C G AGC AACT G T C AC AG T T AAAGT GG GT GAT AAAAT C AC AG C 
T G G T C AG C AGT T AGT T C AAT AT GAT AC AAC AAC T GC AC AAG C AGC CT AC G 
ACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATAATCTA 
AAGAC AAC AG G AAG T C T T C C AG C T AT GGAAT C AAGT GAT C AAT C T T CT T C 
AT CAT C AC AAG GAC AAG G GAC T C AAT CG ACT AGT GGT G C GAC GAAT C G T C 
T AC AG C AAAAT TAT C AAAGT CAAGCT AAT G CT T CAT AC AAC C AAC AAC T T 
C AAG AT T T GAAT GAT G C T T AT G C AG AT G C AC AGG C AG AAG T AAAT AAAG C 
ACAAAAAGCATTGAAT GAT ACTGTT ATT ACAAGTGACGT AT C AGGGAC AG 
TTGTTGAAGTTAATAGTGATATTGATCCAGCTTCAAAAACTAGTCAAGTA 
CT T GT C C AT GT AG C AACT GAAGGT AAAC T C C AAGT AC AAGG AAC GAT GAG 
T G AGT AT GAT T T GG C T AAT GT T AAAAAAG AC C AG GC T GT T AAAAT AAAAT 
CTAAGGTCTATCCTGACAAGGAATGGGAAGGTAAAATTTCATATATCTCA 
AAT TAT C C AGAAG C AG AAGC AAAC AAC AAT GAC T C T AAT AAC G G C T C T AG 
T G C T GT AAAT TAT AAAT AT AAAGT AG AT AT T AC TAG C C C T C T C GAT G CAT 
TAAAACAAGGTTTTACCGTATCAGTTGAAGTAGTTAATGGAGATAAGCAC 
CTTATTGTCCCTACAAGTTCTGT GAT AAACAAAGAT AAT AAAC ACT TTGT 
T T G GGT AT AC AAT GAT T C T AAT C GT AAAAT T T C C AAAG T T GAAGT C AAAA 
T T G GT AAAG CT GAT G CT AAG AC AC AAG AAAT T T T AT C AG GT T T G AAAG C A 
GGAC AAAT CGTGGTT ACT AAT CCAAGT AAAAC CTTCAAGGATGGGCAAAA 
AAT T GAT AAT AT T GAAT C AAT C GAT CT T AAC T C T AAT AAG AAAT C AG AG 

SEQ ID NO. 8506 

STRAIN M732 

TTTTTATGGGTACAATCTCAACCTAATAAGAGTGCAGTAAAAACTAATTA 
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
C AG GAAAAG C T AAG G C T AAT C AAG AAC AG TAT GT GT AT T T T GAT G C T AAT 
AAAGGTAATCGAGCAACTGTTACAGTTAAAGTGGGTGATAAAATCACAGC 
TGGTCAGCAGTTAGTTCAATATGATACAACAACTGCACAAGCAGCCTACG 
AC AC T G C T AAT C GT C AAT T AAAT AAAG TAG CG C G T C AG AT T AAT AAT C T A 
AAGACAACAGGGAGTTTTCCAGCTATGGAATCAAGTGATCAATCTTCATC
ATCATCACAAGGACAAGGGACTCAATCGACTAGTGGTGCGACGAATCGTC
TACAGCAAAAT TAT CAAAGT C AAGCTAAT GCTT CATACAACC AACAACTT 
C AAG AT T T G AAT GAT G C T T AT G C AG AT G C AC AGG C AG AAGT AAAT AAAG C 
ACAAAAAGCATTGAATGATACTGTTATTACAAGTGACGTATCAGGGACAG 
TTGTTGAAGTTAATAGTGATATTGATCCAGCTTCAAAAACTAGTCAAGTA 
CT T GT C CAT G T AGC AAC T G AAG GT AAAC T C C AAGT AC AAGG AAC GAT GAG 
TGAGTATGATTTGGCTAATGTTAAAAAAGATCAGGCTGTTAAAATAAAAT 
CTAAGGTCTATCCTGACAAGGAATGGGAAGGTAAAATTTCATATATCTCA 
AAT TAT C C AG AAG C AGAAG C AAAC AAC AAT G AC T C T AAT AACGG CT CT AG 
T G CT GT AAAT TAT AAAT AT AAAGT AG AT AT TACT AG C C C T C T C GAT G C AT 
T AAAAC AAGGT T T T AC C GT AT C AGT T G AAGT AG T T AAT G GAG AT AAG C AC 
CTTATTGTCCCTACAAGTTCTGTGATAAACAAAGATAATAAACACTTTGT 
T T G G GT AT AC AAT GAT T C T AAT CGT AAAAT T T C CAAAGT T G AAG T C AAAA 
T T GGT AAAG C T G AT GC T AAG AC AC AAG AAAT T T T AT C AG GT T T G AAAG C A 
GG AC AAAT C GT GGT T AC T AAT C C AAG C AAAAC T T T C AAG GAT G G G C AAAA 
AAT T GAT AAT AT T GAAT C AAT C GAT C T T AAGT C T AAT AAGAAAT C AG AG G 
TGAA 

SEQ ID NO. 8507 

STRAIN COH1 

T T T T T AT GGG T AC AAT CT C AAC CT AAT AAGAG T G C AGT AAAAAC 
TAATTACAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTC 
T T T T G AC AG G AAAAG C T AAG GC T AAT C AAG AAC AGT AT GT GT AT T T T GAT 
GCT AAT AAAGGT AAT CG AGC AACTGTT AC AGT T AAAGT GGGTGAT AAAAT 
C AC AG C T G GT C AG C AGTT AGT T C AAT AT GAT AC AAC AAC T G C AC AAG C AG 
C CT AC G AC ACT GCT AAT CGT C AAT T AAAT AAAGT AG C G C GT C AGAT T AAT 
AAT C T AAAG ACAAC AG GGAGT T T T C C AG C TAT G G AAT C AAGT GAT C AAT C 
TTCATCATCATCACAAGGACAAGGGACTCAATCGACTAGTGGTGCGACGA 
AT CGT C TACAGCAAAAT TAT CAAAGT C AAGC T AAT GCTT CAT AC AAC C AA 
C AAC T T C AAG AT T T GAAT GAT GCT TAT G C AG AT G C AC AG GC AG AAGT AAA 
T AAAGCACAAAAAGC ATT GAATGAT ACT GT T AT T ACAAGTGACGT AT CAG 
GGACAGTTGTTGAAGTTAATAGTGATATTGATCCAGCTTCAAAAACTAGT 
C AAG TACT T GT C C AT G T AG C AAC T G AAG GT AAACT C C AAGT AC AAGG AAC 
GATGAGT GAGTATGAT T T GGCT AAT GTTAAAAAAGATC AGGCT GTT AAAA 
T AAAAT C T AAGG T C T AT C C T G AC AAG GAAT GG GAAG GT AAAAT T T CAT AT 
AT CT C AAAT TAT C CAG AAG CAG AAG C AAAC AAC AAT G AC T C T AAT AACGG 
CTC T AGT G CT GT AAAT TAT AAAT AT AAAGT AG AT AT T AC TAG CCCTCTCG 
AT G C AT T AAAAC AAG G T T T T AC C GT AT C AGT T GAAG TAG T T AAT G GAG AT 
AAGCACCTTATTGTCCCTACAAGTTCTGTGATAAACAAAGATAATAAACA 
CTTTGTTTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAG 
T C AAAAT T GGT AAAG C T GAT G C T AAGAC AC AAG AAAT T T TAT C AG GT T T G 
AAAG C AGG AC AAAT C GT GGT T AC T AAT C C AAG C AAAACT T T C AAG GAT G G 
GC AAAAAAT T GAT AAT AT T GAAT C AAT C G AT C T T AAGT C T AAT AAG AAAT 
CAGAGGTGAA 

SEQ ID NO. 8507 

STRAIN M781 *

TTTTTATGGGTACAATCTCAACCTAATAAGAGTGCAGTAAAAACTAATTA 
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
CAGGAAAAGCTAAGGCTAATCAAGAACAGTATGTGTATTTTGATGCTAAT 
AAAGGT AAT CGAGC AACTGTT ACAGTT AAAGT GGGTGAT AAAAT CAC AGC 
T G GT C AG C AGT T AGT T C AAT AT GAT AC AAC AAC T G C AC AAG CAG C CT AC G 
ACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATAATCTA 
AAGAC AAC AGG G AGT T T T C CAG C T AT G GAAT C AAGT GAT C AAT C T T CAT C 
AT CAT CAC AAG G AC AAG G G ACT C AAT C G AC T AGT GG T G CG ACG AAT C GT C 
T AC AG C AAAAT TAT CAAAGT C AAG CT AAT GCTT C AT AC AAC C AAC AACT T 
C AAG AT T T GAAT GAT G CT T AT GC AGAT G CAC AGG C AGAAG T AAAT AAAG C 
AC AAAAAG CAT T GAAT GAT AC T GT TAT T AC AAGT GAC GT AT C AG GG AC AG 
TTGTTGAAGTTAATAGTGATATTGATCCAGCTTCAAAAACTAGTCAAGTA 
C TT G T C C AT GT AG C AACT GAAG G T AAAC T C C AAGT AC AAG G AAC GAT GAG 
TGAGTATGATTTGGCTAATGTTAAAAAAGATCAGGCTGTTAAAAT AAAAT 
C T AAG G T CT AT C C T GAC AAGG AAT G GG AAG GT AAAAT T T CAT AT AT C T C A 
AAT TAT C CAG AAG CAG AAG C AAAC AAC AAT GAC T CT AAT AAC GGCT C T AG 
TGCTGTAAATTATAAATATAAAGTAGATATTACTAGCCCTCTCGATGCAT
TAAAACAAGGTTTTACCGTATCAGTTGAAGTAGTTAATGGAGATAAGCAC
CTT ATTGT CC CTACAAGT T CTGTGAT AAACAAAGATAAT AAAC ACTT T GT 
TTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAGTCAAAA 
T T G GT AAAG C T GAT G C T AAG AC AC AAG AAAT T T TAT C AGGT T T GAAAG C A 
GGACAAATCGTGGTTACTAATCCAAGCAAAACTTTCAAGGATGGGCAAAA 
AATTGAT AATATTGAAT CAAT CGAT CTT AAGT CTAAT AAGAAAT CAGAGG 
TGAA 

SEQ ID NO. 8508 

STRAIN CJB110 

T T T T T AT G G GT AC AAT CT C AAC CT AAT AAGAGT GC AGT AAAAACT AAC T A 
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
CAGGAAAAGCTAAGGCTAATCAAGAACAGTATGTGTATTTTGATGCTAAT 
AAAGGTAATCGAGCAACTGTCACAGTTAAAGTGGGTGATAAAATCACAGC 
TGGTCAGCAGTTAGTTCAATATGATACAACAACTGCACAAGCAGCCTACG 
AC ACT G C T AAT C GT CAAT T AAAT AAAG TAG C G C GT C AG AT T AAT AAT CT A 
AAGAC AAC AGGAAGT CTT CCAGCTATGGAATTAAGT GAT CAAT CTT CTT C 
ATCATCACAAGGACAAGGGACTCAATCGACTAGTGGTGCGACGAATCGTC 
TAG AG C AAAAT TAT C AAAGT C AAG C T AAT G C T T CAT AC AAC C AAC AACT T 
C AAGAT T T G AAT GAT G CT T AT G C AGAT G C AC AG GC AG AAGT AAAT AAAG C 
AC AAAAAG CAT T GAATG AT AC T G T TAT T AC AAGT GAC GT AT C AG GGAC AG 
TTGTTGAAGTTAATAGTGATATTGATCCAGCTTCAAAAACTAGTCAAGTA 
CTT GT C C AT GT AGC AACT GAAG GT AAAC T C C AAGT AC AAGGAAC GAT GAG 
TGAGTATGATTTGGCTAATGTTAAAAAAGACCAGGCTGTTAAAATAAAAT 
C T AAG GT CT AT C CT GAC AAG G AAT GG GAAGG T AAAAT T T CAT AT AT C T C A 
AAT TAT C C AG AAG C AG AAGC AAAC AAC AAT G ACT C T AAT AACG G C T CT AG 
TGCTGTAAATTAT AAAT AT AAAGT AGAT ATT ACT AGCCCTCT CGAT G CAT 
TAAAACAAGGTTTTACCGTATCAGTTGAAGTAGTTAATGGAGATAAGCAC 
CTT AT TGTCCCT AC AAGT T CTGTGAT AAACAAAGATAAT AAAC ACTT TGT 
TTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAGTCAAAA 
T T GGT AAAG C T GAT G CT AAG AC AC AAG AAAT T T TAT C AG GT T T GAAAG C A 
GGACAAATCGTGGTTACTAATCCAAGTAAAACCTTCAAGGATGGGCAAAA 
AAT T GAT AAT AT T G AAT CAAT C GAT C T T AACT C T AAT AAG AAAT C AG AG G 
TGA 

SEQ ID NO. 8509 

STRAIN 1169NT 

T T T T T AT GG GT AC AAT CT C AAC C T AAT AAG AG T G C AG T AAAAAC T 
AACTACAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCT 
T T T G AC AGGAAAAGC T AAG G C T AAT C AAG AAC AGT AT G T G T AT T T T GAT G 
CTAAT AAAGGT AAT C GAG C AACT GT C AC AGT T AAAG T GG GT GAT AAAAT C 
AC AG CT G GT C AG C AGT TAG T T CAAT AT GAT AC AAC AAC T G C AC AAG C AG C 
CTACGACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATA 
AT CT AAAGAC AAC AG G AAGT CT T C C AG C TAT GG AAT C AAGT GAT CAAT CT 
T CT T CAT CAT C AC AAGG AC AAGG G AC T CAAT CGAC T AGT GGT GC G AC G AA 
T C GT CT AC AG C AAAAT TAT C AAAGT C AAG C T AAT G C T T CAT AC AAC C AAC 
AACTTCAAGATTTGAATGATGCTTATGCAGATGCACAGGCAGAAGTAAAT 
AAAGCACAAAAAGCATTGAATGATACTGTTATTACAAGTGACGTATCAGG 
GAC AGT T G T T GAAGT T AAT AGT G AT AT T GAT C C AG CT T C AAAAAC T AGT C 
AAGT AC T T GT C CAT GT AGC AAC T G AAG GT AAAC T C C AAGT AC AAGG AAC G 
AT G AGT GAG TAT GAT T T G G C T AAT GT T AAAAAAG AC C AG G C T GT T AAAAT 
AAAAT CT AAGG T C T AT C C T GAC AAGG AATG GGAAGGT AAAAT T T CAT AT A 
T C T C AAAT TAT C C AGAAG CAGAAG C AAAC AAC AAT GAC T C T AAT AAC GG C 
T C T AGT G C T GT AAAT TAT AAAT AT AAAG TAG AT AT T AC TAG C C CT C T CG A 
T G CAT T AAAAC AAG GT T T T AC C GT AT C AG T T GAAG TAG T T AAT G GAG AT A 
AG C AC CT T AT TGT C CC T AC AAG T T CT GT GAT AAAC AAAG AT AAT AAAC AC 
TTTGTTTGGGTATACAATGATTCTAATCGTAAAATTTCCAAAGTTGAAGT 
C AAAAT T G GT AAAG CT G AT G C T AAG AC AC AAG AAAT T T TAT C AG GT T T G A 
AAG C AGG AC AAAT C GT GG T T AC T AAT C C AAGT AAAAC CTT C AAG GAT G GG 
C AAAAAAT T GAT AAT AT T G AAT CAAT C GAT CT T AACT C T AAT AAG AAAT C 
AGAGGTGAA 

SEQ ID NO. 8510 

STRAIN JM9130013

TTTTTATGGGTACAATCTCAACCTAATAAGAGTGCAGTAAAAACTAACTA
CAAAGTTTTTAATGTTAGAGAAGGAAGTGTTTCGTCCTCAACTCTTTTGA 
C AGGAAAAGCTAAGGCTAAT CAAGAAC AGT AT GTGTATTTT GATGCT AAT 
AAAGGTAATCGAGCAACTGTTACAGTTAAAGTGGGTGATAAAATCACAGC 
TGGTCAGCAGTTAGTTCAATATGATACAACAACTGCACAAGCAGCCTACG 
ACACTGCTAATCGTCAATTAAATAAAGTAGCGCGTCAGATTAATAATCTA 
AAGACAACAGGAAGT CTT CCAGCTATGGAAT CAAGT GATCAAT CT T CATC 
AT CAT CAC AAGG AC AAG G G GC T C AAT CGACT AGT G GT GCG ACG AAT CGT C 
T ACAGCAAAATTAT CAAAGTCAAGCT AAT GCT T CAT AC AAC C AAC AACT T 
C AAG AT T T G AAT GAT G CT T AT GC AG AT GC AC AG G C AGAAGT AAAT AAAG C 
ACAAAAAGCATTGAATGATACTGTTATTACAAGTGACGTATCAGGGACAG 
T TGTTGAAGTT AAT AGTGAT ATTGAT CCAGCT T CAAAAACT AGT CAAGT A 
C T T GT C C AT GT AG C AAC T GAG G GT AAAC T C C AAG TAG AAGG AAC GAT GAG 
TGAGTATGATTTGGCTAATGTTAAAAAAGACCAGTCTGTTAAAATAAAAT 
C T AAGGT CT AT C CT G AC AAG G AAT GG G AAGGT AAAAT T T CAT AT AT C T C A 
AAT TAT C C AG AAG C AG AAG C AAAC AAC AAT G AC T CT AAT AACGG C T C T AG 
TGCTGTAAATTATAAATATAAAGTAGATATTACTAGCCCTCTCGATGCAT 
T AAAAC AAGG T T T T AC T GT AT C AG T T G AAGT AGT T AAT G GAGAT AAGC AC 
CTTATTGTTCCTACAAGTTCTGTGACAAACAAAGATAATAAACACTTTGT 
T T G GGT AT AC AAT GAT T CT AAT C G T AAAAT T T C C AAAGT T G AAGT C AAAA 
T T G GT AAAGC T GAT GCT AAGAC AC AAGAAAT T T T AT C AGGT T T GAAAG C A 
GGACAAATCGTGGTTACTAATCCAAGCAAAACTTTCAAGGATGGGCAAAA 
AAT T GAT AAT AT T G AAT C AAT AG AT C T T AAGT CT AAT AAGAAAT C AG AGG 
TGAAA 

SEQ ID NO. 8511 
STRAIN 2603 frame: 1

MSKRQNLGISKKGAIISGLSVALIVVIGGFLWVQSQPNKSAVKTNYKVFNVREGSVSSST 
LLTGKAKANQEQYVYFDANKGNRATVTVKVGDKITAGQQLVQYDTTTAQAAYDTANRQLN
KVARQINNLKTTGSLPAMESSDQSSSSSQGQGTQSTSGATNRLQQNYQSQANASYNQQLQ
DLNDAYADAQAEVNKAQKALNDTVITSDVSGTVVEVNSDIDPASKTSQVLVHVATEGKLQ
VQGTMSEYDLANVKKDQAVKIKSKVYPDKEWEGKISYISNYPEAEANNNDSNNGSSAVNY
KYKVDITSPLDALKQGFTVSVEVVNGDKHLIVPTSSVINKDNKHFVWVYNDSNRKISKVE 
VKIGKADAKTQEILSGLKAGQIVVTNPSKTFKDGQKIDNIESIDLNSNKKSEVK 

SEQ ID NO. 8512 

STRAIN 090 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMELSDQSSSSSQ 
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK
TFKDGQKIDNIESIDLNSNKKSE 

SEQ ID NO. 8513 

STRAIN A909 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ
GQGAQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTWEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQSVKIKSKVYPDK 
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVTNKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
TFKDGQKIDNIESIDLKSNKKSEVK 

SEQ ID NO. 8514 

STRAIN H36B frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ 
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTVVEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK 
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH 
LIVPTSSVTNKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
AFKDGQKIDNIESIDLKSNKKSEV

SEQ ID NO. 8515

STRAIN 18RS21 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ 
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTWEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK 
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEWNGDKH 
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
TFKDGQKIDNIESIDLNSNKKSE 

SEQ ID NO. 8516 

STRAIN M732 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSFPAMESSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV
SGTWEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK 
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEWNGDKH 
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIWTNPSK 
TFKDGQKIDNIESIDLKSNKKSEV

SEQ ID NO. 8517 

STRAIN COH1 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSFPAMESSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTWEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK 
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEWNGDKH 
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
TFKDGQKIDNIESIDLKSNKKSEV

SEQ ID NO. 8518 

STRAIN M781 frame: 1

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSFPAMESSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTWEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
TFKDGQKIDNIESIDLKSNKKSEV

SEQ ID NO. 8519 

STRAIN M781 frame: 1

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSFPAMESSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTWEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK 
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIWTNPSK 
TFKDGQKIDNIESIDLKSNKKSEV

SEQ ID NO. 8520 

STRAIN CJB110 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMELSDQSSSSSQ
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTWEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK 
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEVVNGDKH
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIVVTNPSK 
TFKDGQKIDNIESIDLNSNKKSEV 

SEQ ID NO. 8521 

STRAIN 1169NT frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ 
GQGTQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTWEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQAVKIKSKVYPDK
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEWNGDKH 
LIVPTSSVINKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIWTNPSK
TFKDGQKIDNIESIDLNSNKKSEV 

SEQ ID NO. 8522 

STRAIN JM9130013 frame: 1 

FLWVQSQPNKSAVKTNYKVFNVREGSVSSSTLLTGKAKANQEQYVYFDANKGNRATVTVK 
VGDKITAGQQLVQYDTTTAQAAYDTANRQLNKVARQINNLKTTGSLPAMESSDQSSSSSQ
GQGAQSTSGATNRLQQNYQSQANASYNQQLQDLNDAYADAQAEVNKAQKALNDTVITSDV 
SGTWEVNSDIDPASKTSQVLVHVATEGKLQVQGTMSEYDLANVKKDQSVKIKSKVYPDK 
EWEGKISYISNYPEAEANNNDSNNGSSAVNYKYKVDITSPLDALKQGFTVSVEWNGDKH 
LIVPTSSVTNKDNKHFVWVYNDSNRKISKVEVKIGKADAKTQEILSGLKAGQIWTNPSK 
TFKDGQKIDNIESIDLKSNKKSEVK

SEQ ID NO. 8601 
STRAIN 2603 

atgaaaaaaattggaattattgtcctcacactactgaccttctttttggtatcttgcgga 
caacaaactaaacaagaaagcactaaaacaactatttctaaaatgcctaaaattgaaggc 
ttcacctattatggaaaaattcctgaaaatccgaaaaaagtaattaattttacatattct 
tacactgggtatttattaaaactaggtgttaatgtttcaagttacagtttagacttagaa 
aaagatagccccgtttttggtaaacaactgaaagaagctaaaaaattaactgctgatgat 
acagaagctattgccgcacaaaaacctgatttaatcatggttttcgatcaagatccaaac 
atcaatactctgaaaaaaattgcaccaactttagttattaaatatggtgcacaaaattat 
ttagatatgatgccagccttggggaaagtattcggtaaagaaaaagaagctaatcagtgg 
gttagccaatggaaaactaaaactctcgctgtcaaaaaagatttacaccatatcttaaag 
cctaacactacttttactattatggatttttatgataaaaatatctatttatatggtaat 
aattttggacgcggtggagaactaatctatgattcactaggttatgctgccccagaaaaa 
gtcaaaaaagatgtctttaaaaaagggtggtttaccgtttcgcaagaagcaatcggtgat 
tacgttggagattatgcccttgttaatataaacaaaacgactaaaaaagcagcttcatca 
cttaaagaaagtgatgtctggaagaatttaccagctgtcaaaaaagggcacatcatagaa 
agtaactacgacgtgttttatttctctgaccctctatctttagaagctcaattaaaatca 
tttacaaaggctatcaaagaaaatacaaat 

SEQ ID NO. 8602 

STRAIN 090 

G AAG G C T T C AC C T AT TAT GG AAAAAT T C CT GAAAAT C C GAAAAAAGT AAT 
TAATTTTACATATTCTTACACTGGGTATTTATTAAAACTAGGTGTTAATG 
T T T C AAGT T AC AGT T TAG AC T T AG AAAAAG AT AG CCCCGTTTTT GGT AAg 
CAACTGAAAGAAGCTAAAAAATTAACTGCTGATGATACAGAAGCTATTGC 
CG C AC AAAAAC CT GAT T T AAT CAT GGT T T T C GAT C AAG AT C C AAAC AT C A 
ATACTCTGAAAAAAATTGCACCAACTTTAGTTATTAAAtATGGTGCACAA 
AAT TAT T TAG AT AT GAT G C C AGC C T T G GGG AAAGT AT T C GGT AAAGAAAA 
AGAAGCTAATCAGTGGGTTAGCCAATGGAAAACTAAAACTCTCGCTGCCA 
AAAAAG AT T T AC AC CAT AT C T T AAAG C C T AAC ACT AC T T T TAG TAT TAT G 
GATTTTTATGATAAAAATATCTATTTATATGGTAATAATTTTGGACGCGG 
t G G AG AACT AAT CT AT GAT T C ACT AGG T TAT G CT G C C C C Ag AAAAAGT C A 
AAAAAgATGTcTTTAAAAAAGGGTGGTTTACCGTTTCgCAAGAAGCAATC 
G G t GAT T ACG T T G GAG AT TAT GCCCTTGTT AAT AT AAAC AAAAC G ACT AA 
AAAAG C AGCT T C a t c AC T T AAAG AAAG T GAT GT C T GG AAG AAT T T AC C AG 
CTGTCaAAAAAGGGCACATCATAGAAAGTAacTACGACGTGTTTTATTTC 
TCTGACCCTCTATCTTTAGAAGCTCAATTAAAATCATTTACAAA 

SEQ ID NO. 8603 

STRAIN A909

GAAGGCTTCACCTATTATGGAAAAATTCCTG 

AAAAT CCGAAAAAAGT AAT T AAT TT T ACAT AT T CTT AC ACT GGAT ATT T A 
T T AAAAC T AGG AGT T AAT GT T T C AAGT T AC AG T T TAG AC T T AG AAAAAGA 
TAgCCCCGTTTTTGGTAAaCAACTGAAAGGAGCTAAAAAATTAACTGCTG 
AT GAT AC AG AAG CT AT T G C C G C AC AAAAAC C T GAT T T Aa T CAT G GT T T T T 
GAT CAAGAT CC AAACAT C AAT ACT CT GAAAAAAATTGC ACCAACTTTAGT 
TAT T AAAT AT GGT G C AC AAAAT TAT T T Ag AT a T GAT G C C AG C T T T G GG G A 
AAGTATTCGGTAAAGAAAAAGAAGCTAATCAGTGGGTTAGCCAaTGGAAA
ACT AAAAC TCTCGCTGC C AAAAAAG AT T T AC AC C AT AT CT T AAAAC CT AA 
CACT ACT T T TACCAT TAT GGAT T T TT AT GAT AAAAAT ATCT AT T TAT AT G 
GT AAT AAT T T T G G AC G CG GT GG AG AAC T AAT C TAT GAT T C AC TAG G T TAT 
GCTGCCCCAGAAAAAGTCAAAAAAGATGTCTTTAAAAAAGGGTGGTTTAC 
CGTTTCGCAAGAAGCAATCGGTgATTACGTTGGAGATTATGCCCTTGTTA 
AT AT AAAC AAAAC G AC T AAAAAAG C AG CT T CAT C ACT T AAAGAAAG T GAT 
GTCTGGAAGAATTTACCAGCTGTCAAAAAAGGGCACATCATAGAAAGTAA 
CTACGACGTGTTTTATTTCTCTGACCCTcTATCTTTAGAAGCTCAATTAA 
AAT CAT T TAG AAA 

SEQ ID NO. 8604 

STRAIN H36B

G AAGG C T T C AC C TAT TAT G G AAAA 

AT T C CT G AAAAT C C G AAAAAAGT AAT T AAT T T T AC AT AT T C T T AC AC T GG 
ATATTTATTAAAACTAGGAGTTAATGTTTCAAGTTACAGTTTAGACTTAG 
AAAAAGATAgCCCCGTTTTTGGTAAgCAACTGAAAGGAGCTAAAAAATTA 
ACT G CT GAT G AT AC AG AAG C T AT T G CC G C AC AAAAAC CT GAT T T Aa T CAT 
GGT T T T T GAT C AAgAT C C AAAC AT C AAT AC T CT GAAAAAAAT T G C AC C AA 
CTTTAGTTATTAAATATGGTGCACAAAATTATTTAgATaTgATGCCAGCT 
TTGGGGAaAGTATTCGGTAAAGAAAAAGAAGCTAATCAGTGGGTTAGCCA 
ATGGAAAACTAAAACTCTCGCTGCCAAAAAAGATTTACACCATATCTTAA 
GGC CT a AC Ac TACT T TT ACT AT T AT AGA t T T T TAT GAT AAAAAT AT CT AT 
T TAT AT GGT AAT AAT T T T GG AC G CG G t G G Ag AAC T AAT C T AT GAT t C AC T 
AGGTTATGCTGCCCCAgAAAAAGTCAAAAAAgATGTCTTTAAAAAAGGGT 
GGT T T AC C G T T T C g C AAG AAG C AAT CG G T g ATT ACGT T GG AG AT TAT G C C 
C T T GT T AAT AT AAAC AAAAC G AC T AAAAAAG C AG C T T C a T C AC T T AAAGA 
AAGTGATGTT T GGAAGAAT TT AC CAGCTGT CAAAAAAGGGCACAT CATAG 
AAAGT AACT AC G AC GT GT T T TAT T T C T C T GAC C C T C TAT CT T TAG AAG CT 
C AATT AAAAT CATTT AC AAA 

SEQ ID NO. 8605 

STRAIN 18RS21 
GAAGGCTTCACCTATTATGGA 

AAAATT CCT G AAAAT CCGAAAAAAGT AAT T AATTTT ACAT ATT CT TACAC 
TGGGTATTTATTAAAACTAGGTGTTAATGTTTCAAGTTACAGTTTAGACT 
T AGAAAAAGAT AGC CCCGTT TT T GGT AAAC AAC TGAAAG AAG CT AAAAAA 
TTAACTGCTGATGATACAGAAGCTATTGCCGCACAAAAACCTGATTTAAT 
CATGGTTTT CGAT CAAGAT CCAAACAT C AAT ACT CT GAAAAAAAT T G CAC 
CAACTTTAGTT ATT AAATATGGTGC AC AAAATT ATTTAg AT aTGATGCCA 
GCCTTGGGGAAAGTATTCGGTAAAGAAAAAgAAGCTAATCAGTGGGTTAG 
CCAATGGAAAACTAAAACTCTCGCTGTC AAAAAAG ATTT AC ACC AT ATCT 
T AAAG C C T AAC AC TACT T T T AC T AT TAT GGAT T T T TAT GAT AAAAAT AT C 
TATTTATATGGTAATAATTTTGGACGCGGTGGAGAACTAATCTATGATTC 
ACT AGGTT AT GCTGCCCCAgAAAAAGTCAAAAAAgATGTCTTT AAAAAAG 
GGTGGTTTACCGTTTCGCAAGAAGCAATCGGTGATTACGTTGGAGATTAT 
GCC CTT GTT AAT ATAAACAAAACgACT AAAAAAGCAGCTTC AT C ACT T AA 
AG AAAGT GAT GT C T G G AAGAAT T T AC C AG C T GT C AAAAAAG G G CAC AT C A 
TAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGAA 
G C T C AAT T AAAAT CAT T T AC AAA 

SEQ ID NO. 8606 

STRAIN M732 

G AAGG C T T CAC C T AT T ATG G 

AAAAAT T CCT GAAAAT CCGAAAAAAGT AATT AATTT T AC AT ATT CTT ACA 
CTGGGTATTTATTAAAACTAGGTGTTAATGTTTCAAGTTACAGTTTAGAC 
TT AGAAAAAGAT AGC CCCGTT TTTGGT AAGC AACTGAAAGAAGCT AAAAA 
ATT AACT GCT GAT GAT AC AGAAGCT ATT GCCGC AC AAAAAC CTGATTTAA 
T CATGGTTTT CGAT CAAGAT CCAAAC AT CAAT ACT CTGAAAAAAATTGCA 
CCAACTTTAGTTATTAAATATGGTGCACAAAATTATTTAgATATGATGCC 
AGCCTTGGGGAAAGTATTCGGTAAAGAAAAAGAAGCTAATCAGtGGGTTA 
GCCAATGGAAAACTAAAACTCTCGCTGCCAAAAAAGATTTACACCATATC 
T T AAAG C C T AAC AC TAG T T T T AC T AT T AT G GAT T T T TAT GAT AAAAAT AT 
CTATTTATATGGTAATAATTTTGGACgCGGtGGAgAACTAATCTATGATT
CACTAGGTTATGCTGCCCCAGAAAAAGTCAAAAAAGATGTCTTTAAAAAA 
GGGTGGTTTACCGTTTCGCAAGAAGCAATCGGTGATTACGTTGGAGATTA 
TGCCCT T GTT AATAT AAAC AAAACGACT AAAAAAGCAGCT T CAT CACTT A 
AAGAAAGTGAT GT CT GGAAGAAt T T AC C AGCTGTCAAAAAAGGGCAC AT C 
ATAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGA 
AGCT CAATTAAAAT CAT TTACAAA 

SEQ ID NO. 8607 

STRAIN COH1 

GAAGG CT T C AC CT AT T AT G 

GAAAAATTCCTGAAAATCCGAAAAAAGTAATTAATTTTACATATTCTTAC 
ACTGGGTATTTATTAAAACTAGGTGTTAATGTTTCAAGTTACAGTTTAgA 
CTTAGAAAAAGATAGCCCCGTTTTTGGTAAGCAACTGAAAGAAGCTAAAA 
AAT T AACT G CT GAT G AT AC AGAAG C TAT T G C CG C AC AAAAAC C T GAT T T A 
AT CAT G GT T T T C GAT C AAGAT C C AAAC AT C AAT ACT C T G AAAAAAAT T GC 
ACCAACTTTAGTTATTAAATATGGTGCACAAAATTATTTAgATATGATGC 
CAGCCTTGGGGAAAGTaTTcGGTAAAGAAAAAGAAGCTAATCAGTGGGTT 
AG C C AAT G G AAAAC T AAAACT C T C G C T GC C AAAAAAG AT T T AC AC CAT AT 
C T T AAAG C CT AAC AC T AC T T T TAG TAT T AT GG AT T T T TAT G AT AAAAAT A 
T CT AT T TAT AT GGT AAT AAT T T T G G AC GC GGT G G AG AAC T AATCT AT GAT 
TCACTAGGTTATGCTGCCCCAGAAAAAGTCAAAAAAGATGTCTTTAAAAA 
AGGGTGGTTTACCGTTTCGCAAGAAGCAATCGGTGATTACGTTGGAGATT 
AT G C C C T T GT T AAT AT AAAC AAAAC GAC T AAAAAAG C AG C T T CAT C AC TT 
AAAG AAAGT GAT GT C T G G AAG AAT T T AC C AG CT GT C AAAAAAG G G C AC AT 
CATAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAG 
AAGCT CAATTAAAAT CATTT AC AAA 

SEQ ID NO. 8608 

STRAIN M781 
GAAGGCTTCACCTATTATGG 

AAAAATTCCTGAAAATCCGAAAAAAGTAATTAATTTTACATATTCTTACA 
CTGGGTATTTATTAAAACTAGGTGTTAATGTTTCAAGTTACAGTTTAGAC 
TTAgAAAAAGATAGCCCCGTTTTTGGTAAGCAACTGAAAGAAGCTAAAAA
AT T AAC T G C T GAT GAT AC AG AAG C T AT T G C C G C AC AAAAAC C T GAT T T AA 
T CAT GGT T T T C GAT C AAG AT C C AAAC AT C AAT AC T C T GAAAAAAAT T G C A 
CC AACTTTAGT T ATT AAAT ATGGT GCACAAAATT AT TT AgATAT GAT G CC 
AGCCTTGGGGAAAGTATTCGGtAAAGAAAAAGAAGCTAATCAGTGGGTTA 
G C C AAT G G AAAAC T AAAAC T CTCGCTGC C AAAAAAG AT T T AC AC CAT AT C 
TTAAAGCCTAACACTACTTTTACTATTATGGATTTTTATGATAAAAATAT 
CTATTTAT ATGGT AAT AATTTTGGACGCGGTGGAGAACT AATCT AT GAT T 
CACTAGGTTATGCTGCCCCAGAAAAAGTCAAAAAAGATGTCTTTAAAAAA 
GGGT GGT T T AC C GT T T C G C AAGAAG C AAT C G GT G AT T AC GT T G GAG AT T A 
T G C C CT T GT T AAT AT AAAC AAAAC G AC T AAAAAAG C AGC T T CAT C AC T T A 
AAG AAAGT GAT GT C T G G AAG AAT T T ACC AG C T G T C AAAAAAG GG C AC AT C 
ATAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGA 
AG CT C AAT T AAAAT CAT T T AC AAA 

SEQ ID NO. 8609 

STRAIN CJB110 
GAAGGCTTCACCTATTATGGA 

AAAATTCCTGAAAATCCGAAAAAAGTAATTAATTTTACATATTCTTACAC 
TGGGTATTTATTAAAACTAGGTGTTAATGTTTCAAGTTACAGTTTAGACT 
TAG AAAAAG AT AG CCCCGTTTTT G GT AAGC AAC T G AAAG AAG CTAAAAAA 
T T AACT G C T GAT GAT AC AG AAG CT AT T G C C G C AC AAAAAC C T GAT T T AAT 
CAT G GT T T T C GAT C AAG AT C C AAAC AT C AAT ACT CT GAAAAAAAT T GC AC 
C AACT T T AGT T AT T AAAT AT GGT G C AC AAAAT TAT T T Ag AT AT GAT G C C A 
GCCTTGGGGAAAGTATTCGGTAAAGAAAAAGAAGCTAATCAGTGGGTTAG 
CCAATGGAAAACTAAAACTCTCGCTGCCAAAAAAGATTTACACCATATCT 
T AAAG C C T AAC AC T AC T T T T AC TAT T AT G GAT T T T TAT GAT AAAAAT AT C 
T AT T TAT AT GGT AAT AAT T T T GG AC G C GG t G GAG AAC T AAT C T AT GAT T C 
ACTAGGTTATGCTGCCCCAGAAAAAGTCAAAAAAGATGTCTTTAAAAAAG 
GGTGGTTTACCGTTTCGCAAGAAGCAATCGGTGATTACGTTGGAGATTAT 
GCCCTTGTTAATATAAACAAAACGACTAAAAAAGCAGCTTCATCACTTAA 
AGAAAG T G AT GT C T GGAAGAAT T T AC C AG C T GT C AAAAAAG G GC AC AT C A 
TAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGAA
GCTCAATTAAAATCATTTACAAA 

SEQ ID NO. 8610 

STRAIN 1169NT 

G AAGG C T T C AC C T AT T AT GGAAAAAT T 

CCTGAAAATCCGAAAAAAGTAATTAATTTTACATATTCTTACACTGGGTA 
T T T AT T AAAAC T AGGT GT T AAT GT T T C AAGT T AC AGT T TAG ACT TAG AAA 
AAGATAGCCCCGTTTTTGGTAAGCAACTGAAAGAAGCTAAAAAATTAACT 
G C T GAT G AT AC AG AAG CT AT T GC C g c AC AAa a ACCT GAT T T AAT CAT G GT 
TTTCGATC AAGAT C CAAACAT CAATACT CT GAAAAAAATT GC ACC AACT T 
TAGTTATTAAATATGGTGCACAAAATTATTTAgATATGATGCCAGCCTTG 
GGG AAAGT AT T CGG T AAAG AAAAAGa a G C T AAT C AGT G G G T TAG C C AAT G 
G AAAAC T AAAAC TCTCGCTGC C AAAAAAGAT T T AC AC CAT AT C T T AAAG C 
CTAACACTACTTTTACTATTATGGATTTTTATGATAAAAATATCTATTTA 
TAT GGT AAT AAT T T T G G AC G C GGT GG AG AAC T AAT C TAT GAT T C AC T AGG 
TTATGCTGCCCCAgAAAAAGTCAAAAAAGATGTCTTTAAAAAAGGGTGGT 
TTACCGTTTCgCAAGAAGCAATCGGTGATTACGTTGGAGATTATGCCCTT 
GTTAATATAAACAAAACGACTAAAAAAGCAGCTTCATCACTTAAAGAAAG 
T GAT GT CT GG AAG AAT T T AC C AG C T GT C AAAAAAG G G C AC AT CAT AG AAA 
GTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAGAAGCTCAA 
TTAAAATCATTTACAAA 

SEQ ID NO. 8611 

STRAIN JM9130013 

G AAG G CT T C AC C TAT TAT G 

G AAAAAT T C C T G AAAAT C CG AAAAAAGT AAT T AAT T T T AC AT AT T C T T AC 
AC T GG AT AT T T AT T AAAAC TAG G AGT T AAT GT T T C AAG T T AC AG T T T AGA 
CTTAGAAAAAGATAGCCCCGTTTTTGGTAAGCAACTGAAAGGAGCTAAAA 
AAT T AAC T G C T GAT GAT AC AG AAG CT AT T GC C G C AC AAAAAC C T GAT T T A 
AT CATGGT T T TTGAT CAAGATC CAAACAT CAATACT CT GAAAAAAATT G C 
AC C AACT T T AGTT AT T AAAT ATGGTGCAC AAAATTAT T T AgAT AT GAT G C 
CAGCTTTGGGGAAAGTATTCGGTAAAGAAAAAGAAGCTAATCAGTGGGTT 
AGCCAATGGAAAACTAAAACTCTCGCTGCCAAAAAAGATTTACACCATAT 
CTTAAAACCTAACACTACTTTTACCATTATGGATTTTTATGATAAAAATA 
TCTATTTATATGGTAATAATTTTGGACGCGGtGGAGAACTAATCTATGAT 
T C AC T AG GT T AT G CT G C C C C Ag AAAAAGT C AAAAAAG AT G T C T T T AAAAA 
AGGGT GGT T T AC C GT T T C g C AAGAAG C AAT C GGT G AT T AC GT T G GAGAT T 
AT GC CCT TGTT AAT AT AAACAAAACGACT AAAAAAGC AGCTT C AT CACT T 
AAAG AAAGT GAT GT C T GG AAG AAT T T AC C AG C T GT C AAAAAAG G G C AC AT 
CATAGAAAGTAACTACGACGTGTTTTATTTCTCTGACCCTCTATCTTTAG 
AAG C T C AAT T AAAAT CAT T T AC AAA 

SEQ ID NO. 8612 
STRAIN 2603 frame: 1

MKKIGIIVLTLLTFFLVSCGQQTKQESTKTTISKMPKIEGFTYYGKIPENPKKVINFTYS 
YTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTADDTEAIAAQKPDLIMVFDQDPN 
INTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEANQWVSQWKTKTLAVKKDLHHILK 
PNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAPEKVKKDVFKKGWFTVSQEAIGD 
YVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHIIESNYDVFYFSDPLSLEAQLKS 
FTKAIKENTN 

SEQ ID NO. 8613 

STRAIN 090 frame: 1

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA 
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP 
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8614 

STRAIN A909 frame: 1

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKGAKKLTA 
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8615 

STRAIN H36B frame: 1

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKGAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILRPNTTFTIIDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8616 

STRAIN 18RS21 frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAVKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8617 

STRAIN M732 frame: 1

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8618 

STRAIN COH1 frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8619 

STRAIN M781 frame: 1

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8620 

STRAIN CJB110 frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI 
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8621 

STRAIN 1169NT frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKEAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI
IESNYDVFYFSDPLSLEAQLKSFT 

SEQ ID NO. 8622 

STRAIN JM9130013 frame: 1 

EGFTYYGKIPENPKKVINFTYSYTGYLLKLGVNVSSYSLDLEKDSPVFGKQLKGAKKLTA
DDTEAIAAQKPDLIMVFDQDPNINTLKKIAPTLVIKYGAQNYLDMMPALGKVFGKEKEAN 
QWVSQWKTKTLAAKKDLHHILKPNTTFTIMDFYDKNIYLYGNNFGRGGELIYDSLGYAAP
EKVKKDVFKKGWFTVSQEAIGDYVGDYALVNINKTTKKAASSLKESDVWKNLPAVKKGHI
IESNYDVFYFSDPLSLEAQLKSFT

SEQ ID NO. 8701 
STRAIN 2603 

ATGAAATTATCGAAGAAGTTATTGTTTTCGGCTGCTGTT 

TTAACAATGGTGGCGGGGTCAACTGTTGAACCAGTAGCTCAGTTTGCGACTGGAATGAGT
ATTGTAAGAGCTGCAGAAGTGTCACAAGAACGCCCAGCGAAAACAACAGTAAATATCTAT 
AAAT T AC AAG C T GAT AGT T AT AAAT CGG AAATT AC T T C T AAT GGT GGT AT C G AG AAT AAA 
GACGGCGAAGTAATATCTAACTATGCTAAACTTGGTGACAATGTAAAAGGTTTGCAAGGT 
GTACAGTTTAAACGTTATAAAGTCAAGACGGATATTTCTGTTGATGAATTGAAAAAATTG 
ACAACAGTTGAAGCAGCAGATGCAAAAGTTGGAACGATTCTTGAAGAAGGTGTCAGTCTA 
CCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGATTCAAAAAGTAATGTG 
AGATACTTGTATGTAGAAGATTTAAAGAATTCACCTTCAAACATTACCAAAGCTTATGCT 
GTACCGTTTGTGTTGGAATTACCAGTTGCTAACTCTACAGGTACAGGTTTCCTTTCTGAA 
ATTAATATTTACCCTAAAAACGTTGTAACTGATGAACCAAAAACAGATAAAGATGTTAAA 
AAAT T AG GT C AG G AC GAT G C AGGT TAT AC GAT T GG T GAAG AAT T C AAAT G G T T C T T GAAA 
TCTACAATCCCTGCCAATTTAGGTGACTATGAAAAATTTGAAATTACTGATAAATTTGCA 
GAT G G CT T G AC T TAT AAAT CT GT T GGAAAAAT CAAGAT T G GT T CG AAAAC AC T G AAT AG A 
GAT GAG C AC T AC ACT AT T GAT G AAC C AAC AGT T G AT AAC C AAAAT AC AT T AAAAAT T ACG 
TTTAAACCAGAGAAATTTAAAGAAATTGCTGAGCTACTTAAAGGAATGACCCTTGTTAAA 
AATCAAGATGCTCTTGATAAAGCTACTGCAAATACAGATGATGCGGCATTTTTGGAAATT 
C C AGT T G CAT C AAC TAT T AAT G AAAAAG C AGT T T T AGG AAAAG C AAT T GAAAAT AC T T TT 
G AAC T T C AAT AT G AC CAT ACT C C T G AT AAAG C T G AC AAT C C AAAAC CAT C T AAT C C T C C A 
AG AAAAC C AG AAG T T CAT ACT GGT GGG AAAC G AT T T GT AAAG AAAG AC T C AAC AGAAAC A 
C AAAC ACT AGGT GGT GCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGC AGT AAAAT GG 
ACAGATGCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAAGCTGTTACT 
G GG C AAC C AAT C AAAT T G AAAT C AC AT AC AGAC GG T AC GT T T G AG AT T AAAGGT T T GG C T 
TAT G C AGT T GAT G C G AAT G C AG AG GG T AC AG C AGT AAC T T AC AAAT T AAAAG AAAC AAAA 
GC AC C AG AAGGT T AT GT AAT CC C T GAT AAAG AAAT CG AGT T T AC AGT AT C AC AAAC AT C T 
TAT AAT AC AAAAC C AACT G AC AT C AC GGT T GAT AGT G C T GAT G C AAC AC C T GAT AC AAT T 
AAAAACAACAAACGTCCTTCAATCCCTAATACTGGTGGTATTGGTACGGCTATCTTTGTC 
GCTATCGGTGCTGCGGTGATGGCTTTTGCTGTTAAGGGGATGAAGCGTCGTACAAAAGAT 
AAC 

SEQ ID NO. 8702 

STRAIN 090 

GCAGAAGTGTC AC AAG AACGCCCAGCG AAAAC 

AGCAGTAAATATCTATAAATTACAAGCTGATAGTTATAAATCGGAAATTA 
CTTCTAATGGTGGTATCGAGAATAAAGACGGCGAAGTAATATCTAACTAT
GCTAAACTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAAACG 
T T AT AAAGT C AAG AC GGAT AT T T CT G T T GAT G AAT T G AAAAAAT T G AC AA 
CAGTTGAAGCAGCAGATGCAAAAGTTGGAACGATTCTTGAAGAAGGTGTC 
AGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGA 
T T C AAAAAGT AAT G T GAG AT ACT T GT AT G TAG AAG AT T T AAAGAAT T C AC 
CTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTACCA 
GTTGCTAACTCTACAGGTACAGGTTTCCTTTcTGAAATTAATATTTACCC 
T AAAAAC G T T GT AAC T GAT G AAC C AAAAAC AG AT AAAG AT GT T AAAAAAT 
TAG G T C AGG AC GAT G C AG GT TAT AC GAT T G G T GAAG AAT T C AAAT G GTT C 
T T G AAAT CT AC AAT C C CT G C C AAT TT AGGT G AC TAT G AAAAAT T T GAAAT 
T ACT G AT AAATT TGC AGAT GGCTTG ACT TAT AAAT CTGTTGG AAAAAT C A 
AG AT T GGT T C GAAAAC ACT G AAT AG AG AT GAG C AC T AC AC T AT T GAT G AA 
C C AAC AGTT G AT AAC C AAAAT AC AT T AAAAAT T ACGT T T AAAC C AGAG AA 
AT T T AAAG AAAT T G C T GAG C T AC T T AAAG G AAT G AC C CT T GT T AAAAAT C 
AAGATGCTCTTGATAAAGCTACTGCAAATACAGATGATGCGGCATTTTTG 
GAAAT T C C AGT T G CAT C AAC TAT T AAT G AAAAAG C AGT T T T AG G AAAAGC 
AAT T GAAAAT AC T T T T G AAC T T C AAT AT G AC CAT ACT C C T GAT AAAG C T G 
AC AAT C C AAAACC AT CT AAT CCT CCAAG AAAACCAG AAGTT C AT ACTGGT 
G G G AAAC GAT T T G T AAAG AAAG AC T C AAC AG AAAC AC AAAC AC TAG GT G G 
TGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAG 
AT G CT C T TAT T AAAG C G AAT ACT AAT AAAAAC TAT AT T G C T G GAG AAG C T 
GTTACTGGGCAACCAATCAAATTGAAATCACATACAGACGGTACGTTTGA 
GATTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAGCAG 
T AAC T T AC AAAT T AAAAG AAAC AAAAG C AC C AG AAGGT T AT GT AAT C CCT
GAT AAAG AAAT C G AGT T T AC AGT AT C AC AAAC AT C T T AT AAT AC AAAAC C 
AACTGACAT CACGGTTGATAGTGCT GATGCAAC AC CT GAT AC AATTAAAA 
ACAACAAACGTCCTTCA 

SEQ ID NO. 8703 

STRAIN A909 

GCAGAAGTGTCACAAGAACGCCCAGCGAA 

AAC AAC AG T AAAT AT C T AT AAAT T AC AAG C T GAT AGT TAT AAAT C GG AAA 
TTACTT CT AAT GGT GGT ATCGAGAAT AAAGACGGCGAAGT AAT AT CT AAC 
TATGCTAAACTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAA 
ACG T T AT AAAGT C AAG AC GGAT AT TT CTGT TG AT G AAT T G AAAAAAT T GA 
C AAC AGT T GAAG C AGC AG AT GC AAAAGT T GG AACG AT T CT T G AAG AAG GT 
GTCAGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCT 
GGATTCAAAAAGTAATGTGAGATACTTGTATGTAGAAGATTTAAAGAATT 
CACCTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTA 
CCAGTTGCTAACTCTACAGGTACAGGTTTCCTTTCTGAAATTAATATTTA 
CCCTAaaAACGTTGTAACTGATGAACCAAAAACAGATAAAGATGTTAAAA 
AAT T AGG T C AG G AC GAT G C AGGT TAT ACG AT T GGT GAAG AAT T C AAAT GG 
TTCTTGAAATCTACAATCCCTGCCAATTTAGGTGACTATGAAAAATTTGA 
AAT T AC T GAT AAAT T T G C AGAT G G C T T G AC T TAT AAAT CTGT T GGAAAAA 
T C AAG AT T GGT T CG AAAAC AC T GAAT AG AGAT GAG C ACT AC AC TAT T GAT 
GAACCAACAGTTGATAACCAAAATACATTAAAAATTACGTTTAAACCAGA 
GAAATTT AAAGAAATTGCT GAGCT ACTT AAAGGAATGACC CT T GT TAAAA 
AT C AAG AT G C T C T T GAT AAAG C T AC T G C AAAT AC AG AT GAT G C GG CAT T T 
T T GG AAAT T C C AGT T GC AT C AAC TAT T AAT GAAAAAG C AGT T T T AGG AAA 
AG C AAT T G AAAAT AC T T T T G AACT T C AAT AT G AC CAT AC t C C T GAT AAAG 
CT GACAAT CC AAAAC CAT CTAAT CCT C C AAG AAAAC C AG AAG T T CAT ACT 
GGT G GG AAAC GAT T T GT AAAGAAAG AC T C AAC AG AAAC AC AAAC ACT AGG 
TGGTGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGA 
CAGATGCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAA 
GCTGTTACTGGGCAACCAATCAAATTGAAATCACATACAGACGGTACGTT 
T G AG AT T AAAGGT T T G G C T T AT G C AGT T GAT G C GAAT GC AG AGGGT AC AG 
C AG T AACT T AC AAAT T AAAAG AAAC AAAAG C AC C AGAAG GT T AT GT AAT C 
CCT GAT AAAG AAAT C GAGT T T AC AGT AT C AC AAAC AT C T TAT AAT AC AAA 
AC C AACT G AC AT C AC G GT T GAT AGT G C T GAT G C AAC AC C T GAT AC AAT T A 
AAAACAACAA 

SEQ ID NO. 8704 

STRAIN 18RS21 

G C AG AAGT GT C AC AAGAACG C C C AG C GAAAAC 

AGCAGTAAATATCTATAAATTACAAGCTGATAGTTATAAATCGGAAATTA 
CT T CT AAT GGT GGT AT C G AG AAT AAAGACGGCGAAGT AAT AT CT AAC TAT 
G C T AAAC T T GG T G AC AAT GT AAAAGGT T T G C AAG GT GT AC AGT T T AAACG 
T TAT AAAGT C AAG AC G GAT AT T T CT GT T GAT GAAT T G AAAAAAT T G AC AA 
C AGT T GAAG C AG C AG AT G C AAAAGT T G G AAC GAT T CT T GAAG AAGGT GT C 
AGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGA 
TT CAAAAAGT AAT GT GAGAT ACT T GT AT GT AGAAGAT T TAAAGAATT CAC 
CTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTACCA 
GTTGCTAACTCTACAGGTACAGGTTTCCTTTCTGAAATTAATATTTACCC 
TAAAAACGTTGTAACTGATGAACCAAAAACAGATAAAGATGTTAAATAAT 
TAGGTCAGGACGATGCAGGTTATACGATTGGTGAAGAATTCAAATGGTTC 
T T G AAAT CT AC AAT C C CT G C C AAT T T AGG T G ACT AT G AAAAAT T T G AAAT 
T AC T GAT AAAT T T G C AG AT G G C T T G AC T TAT AAAT C T G T T GGAAAAAT C A 
AGAT T GGT T C GAAAAC AC T GAAT AG AG AT GAG C AC T AC AC T ATT GAT G AA 
C C AAC AGT T GAT AAC C AAAAT AC ATT AAAAAT T AC GTTT AAAC CAg AG AA 
AT T T AAAG AAAT T G C T GAG C T ACT T AAAGG AAT G AC C CT T G T T AAAAAT C 
AAG AT G CT C T T GAT AAAG C T AC T GC AAAT AC AG AT GAT G C GGC AT T T T T G 
G AAAT T C C AGT T G CAT C AAC TAT T AAT G AAAAAGC AGT T T T AG GAAAAG C 
AAT T G AAAAT ACT T T T G AAC T T C AAT AT G AC CAT AC T C C T G At AAAG C t G 
AC AAT C C AAAAC CAT C T AAT CCT C C AAG AAAAC CAG AAGT T C AT ACT G GT 
GG G AAAC GAT T T GT AAAGAAAG ACT C AAC AG AAAC AC AAAC ACT AGGT G G 
TGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAG 
ATGCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAAGCT 
GT T ACT G G GC AAC C AAT C AAAT T GAAAT C AC AT AC AG AC G GT AC G T T T G A
GATTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAGCAG 
T AACT T ACAAAT T AAAAGAAAC AAAAG C AC C AGAAGGT T AT GT AAT C C CT 
G AT AAAGAAAT CG AGT T T AC AGT AT C AC AAAC AT C T T AT AAT AC AAAACC 
AACT G AC AT C AC GGT T GAT AGT GCT GAT G C AAC AC C T GAT AC AAT T AAAA 
AC AAC AAACG T C C T T C A 

SEQ ID NO. 8705 

STRAIN M732 

GCAGAAGTGT CACAAGAACGCCCAGCGAAAACAACAGT 
AAATATCTATAAATTACAAGCTGATAGTTATAAATCGGAAATTACTTCTA 
AT GGT G GT AT C G AG AAT AAAGAC G G C G AAG T AAT AT C T AAC TAT G C T AAA 
CTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAAACGTTATAA 
AGT C AAG ACG G AT AT T T C T GT T GAT G AAT T G AAAAAAT T G AC AAC AG t T G 
AAG C AG C AGAT G C AAAAG T T GG AACG AT T C T T G AAG AAG GT GT C AG T CT A 
CCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGATTCAAA 
AAG T AAT G T G AGAT AC T T GT AT GT AG AAG AT T T AAAG AAT T C AC C T T C AA 
AC AT T AC C AAAGC T TAT GCT GT AC C G T T T G T GT T GG AAT T AC C AGT T G CT 
AACTCTACAGGTACAGGTTTCCTTTCTGaAATTAATATTTACCCTAAAAA 
CGT T GT AAC T GAT G AAC CAAAAAC AGAT AAAGAT GT T AAAAAAT TAGGT C 
AGG AC GAT G C AG GT TAT AC GAT T G GT G AAG AAT T C AAAT GG T T CT T G AAA 
TCTACAATCCCTGCCAATTTAGGTGACTATGAAAAATTTGAAATTACTGA 
T AAAT T T G C AG AT G G CT T G ACT TAT AAAT C T GT T G GAAAAAT C AAGAT T G 
G T T C G AAAAC ACT G AAT AGAG AT GAG C AC T AC AC TAT T GAT G AAC C AAC A 
GTTGATAACCAAAATACATTAAAAATTACGTTTAAACCAGAGAAATTTAA 
AG AAAT T G C T GAG C T AC T T AAAG G AAT G AC C C T T G T T AAAAAT C AAG AT G 
CTCTTGATAAAGCTACTGCAAATACAGATGATGCGGCATTTTTGGAAATT 
CCAGTTGCAT CAACT ATT AAT GAAAAAGCAGT T TT AGGAAAAGC AAT T GA 
AAAT ACT T T T G AAC T T C AAT AT G AC CAT ACT C C T GAT AAAG C T G AC AAT C 
C AAAAC CAT C T AAT C C T C C AAG AAAAC C AG AAGT T CAT ACT GGT G GG AAA 
C GAT T T GT AAAG AAAG AC T C AAC AG AAAC AC AAAC ACT AGG T G GT GCT G A 
GTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAGATGCTC 
T TAT T AAAG CG AAT AC T AAT AAAAAC T AT AT T G C T G GAG AAG C T G T T AC T 
G GGC AAC C AAT C AAAT T GAAAT C AC AT AC AGAC G GT AC GT T T GAG AT T AA 
AG GTTTGGCT TAT G C AGT T G AT G C G AAT G C AG AG G GT AC AG C AGT AACT T 
AC AAAT T AAAAG AAAC AAAAG C AC C AG AAGGT T AT GT AAT C C C T GAT AAA 
GAAAT C G AGT T T AC AGT AT C AC AAAC AT CT TAT AAT AC AAAAC CAACT G A 
CATCACGGTTGATAGTGCTGATGCAACACCTGATACAATTAAAAACAACA 
AACGTCCTTCA 

SEQ ID NO. 8706 

STRAIN COH1 

GCAGAAGTGT CACAAGAACGCCCAGCGAAAAC 

AGC AGT AAAT AT C TAT AAAT T AC AAG C T GAT AG T TAT AAAT C G GAAAT T A 
C T T n T AAT GGT G GT AT C GAGAAT AAAG ACG G C G AAG T AAT AT C T AAC TAT 
GCT AAACT T GGT G AC AAT GT AAAAGGT TTG CAAGGT GT AC AGT T T AAACG 
T T AT AAAGT C AAG AC GGAT AT T T C T GT T GAT G AAT T GAAAAAAT T GAC AA 
C AGT T G AAG C AG C AGAT G C AAAAGT T G G AAC GAT T C T T G AAG AAGGT GT C 
AGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGA 
T T C AAAAAGT AAT GT GAG AT AC T T GT AT G TAG AAG AT T T AAAG AAT T C AC 
CTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTACCA 
G T T G C T AAC T C T AC AGGT AC AGGT TTCCTTTCT GAAAT T AAT AT T T AC C C 
T AAAAACGT T GT AACT GAT G AAC CAAAAAC AGAT AAAGAT GT T AAAAAAT 
TAG G T C AGG AC GAT G C AG GT TAT AC GAT T G GT G AAG AAT T C AAAT G GT T C 
TT GAAAT CTACAATCCCTGCCAATTT AGGT GACTATGAAAAATTT GAAAT 
TACT GAT AAAT T T GC AG AT G G CT T G AC T TAT AAAT C T GT T G GAAAAAT C A 
AGAT T G G T T C GAAAAC ACT G AAT AG AG AT GAG C AC T AC AC T AT T GAT G AA 
CCAACAGTT GAT AACCAAAATACAT T AAAAAT TACGTT TAAACC AGAG AA 
ATTTAAAGAAATTGCTGAGCTACTTAAAGGAATGACCCTTGTTAAAAATC 
AAGAT GCT CT T G AT AAAGCT AC T G C AAAT AC AG AT GAT G C G G C AT T T T T G 
GAAAT TC C AGT TGCAT CAACT AT T AAT GAAAAAGCAGT T T T AGGAAAAGC 
AAT T G AAAAT ACT T T T G AA C T T C AAT AT GAC CAT AC T C C T GAT AAAG C T G 
AC AAT C C AAAAC CAT CT AAT C CT C C AAG AAAAC C AG AAGT T CAT AC T G GT 
GGGAAAC GAT TTG TAAAGAAAG ACT C AAC AG AAAC ACAAAC ACT AGGT GG 
T GCT GAGT T T GAT TTGTTGGCTTCT GAT GGG AC AG C AGT AAAAT GG AC AG
ATGCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAAGCT 
G T T AC T G G G C AAC C AAT C AAAT T G AAAT C AC AT AC AG AC G GT AC GT T T G A 
GATTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAGCAG 
T AAC T T AC AAAT T AAAAG AAAC AAAAG C AC C AGAAG G TT AT G T AAT C C CT 
GATAAAGAAAT CGAGT TTACAGT AT CAC AAACAT CTT AT AAT ACAAAACC 
AACTGACATCACGGTTGATAGTGCTGATGCAACACCTGATACAATTAAAA 
ACAACAAACGTCCTTCA 

SEQ ID NO. 8707 

STRAIN M781

G C AGAAGTG T C AC AAG AAC G C C C AG CGAAAAC AG 

C AG T AAAT AT C TAT AAAT T AC AAG C T G AT AGT T AT AAAT C G G AAAT T AC T 
T CT AAT GGT GG TAT CG AGAAT AAAG ACGG C GAAGT AAT AT CT AACT AT GC 
TAAACTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAAACGTT 
AT AAAGT C AAG AC G GAT AT T T CT GT TGAT G AAT T G AAAAAAT T G AC AAC A 
GT T GAAG C AG C AGAT G C AAAAGT T G G AAC GAT T C T T G AAGAAGG T GT C AG 
TCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGATT 
CAAAAAGT AAT GT G AGAT ACTTGTAT GT AGAAGATT TAAAGAATT C AC CT 
TCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTACCAGT 
TGCTAACTCTACAGGTACAGGTTTCCTTTCTGaAATTAATATTTACCCTA 
AAAACGT T GT AAC T GAT G AAC C AAAAAC AG AT AAAG AT G T T AAAAAAT T A 
GGT C AGG AC GAT G C AG G T T AT ACG AT T GGT GAAG AAT T C AAAT G GT T C T T 
G AAAT CT AC AAT C CCT G C C AAT TT AG GTG ACT AT G AAAAAT T T GAAAT T A 
CTGATAAATTTGCAGATGGCTTGACTTATAAATCTGTTGGAAAAATCAAG 
AT T G GT T C G AAAAC ACT G AAT AG AG AT GAG CAC TAG AC TAT T GAT G AAC C 
AAC AGT T GAT AAC C AAAAT AC AT T AAAAAT T AC GT T T AAAC C AG AG AAAT 
T T AAAG AAAT T G C T G AGC T AC T T AAAGG AAT G AC C C T T G T T AAAAAT C AA 
GATGCT CTT GAT AAAGCT ACT GC AAAT ACAGATGATGCGGCATTTTTGGA 
AAT T C C AGT T G C AT C AACT AT T AAT G AAAAAG C AGT T T T AGG AAAAG C AA 
T T G AAAAT AC T T T T GAACT T C AAT AT G AC CAT ACT CCT GAT AAAG C T GAC 
AATCCAAAACCATCTAATCCTCCAAGAAAACCAGAAGTTCATACTGGTGG 
GAAACGATTT GTAAAGAAAGACT CAACAGAAACAC AAAC ACTAGGT GGT G 
CT G AGT T T GAT TTGTTGGCTTCT GAT G GG AC AG C AGT AAAAT GG AC AG AT 
GCTCTTATTAAAGCGAATACTAATAAAAACTATATTGCTGGAGAAGCTGT 
TAG T GG G C AAC C AAT C AAAT T GAAAT C AC AT AC AG ACGGT AC GT T T GAGA 
TTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAGCAGTA 
AC T T AC AAAT TAAAAGAAAC AAAAG CAC C AG AAGG T T AT GT AAT C C C T G A 
T AAAG AAAT C GAGT T T AC AGT AT CAC AAAC AT C T TAT AAT AC AAAAC C AA 
C T GAC AT C AC G GT T GAT AGT GCT GAT G C AAC AC CT GAT AC AAT T AAAAAC 
AAC AAAC GT 

SEQ ID NO. 8708 

STRAIN CJB110 

G C AGAAG T GT C AC AAGAAC G C C C AG C G AA 

AAC AG C AGT AAAT AT C TAT AAAT T AC AAG C T GAT AG T TAT AAAT T G GAAA 
TTACTTCTAATGGTGGTATCGAGAATAAAGACGGCGAAGTAATATCTAAC 
TATGCTAAACTTGGTGACAATGTAAAAGGTTTGCAAGGTGTACAGTTTAA 
AC GT TAT AAAG T C AAG AC GG AT AT T T CT G T T GAT G AAT T G AAAAAAT T G A 
C AAC AGT T G AAG C AG C AG AT G C AAAAGT T G G AAC GAT T CT T GAAG AAGG T 
GTCAGTCTACCTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCT 
GG AT T CAAAAAGT AAT G T GAG AT AC T T G T AT G TAG AAG AT T T AAAG AAT T 
CACCTTCAAACATTACCAAAGCTTATGCTGTACCGTTTGTGTTGGAATTA 
CCAGTTGCTAACTCTACAGGTACAGGTTTCCTTTCTGAAATTAATATTTA 
CCCTAAAAACGTTGTAACTGATGAACCAAAAACAGATAAAGATGTTAAAA 
AATTAGGTCAGGACGATGCAGGTTATACGATTGGTGAAGAATTCAAATGG 
T T C T T GAAAT CT AC AAT C C CT G C C AAT T T AG GT G ACT AT G AAAAAT T T G A 
AAT T AC T GAT AAAT T T G C AG AT G G CT T GAC T TAT AAAT C T GT T GG AAAAA 
T C AAG AT T G GT T C G AAAAC AC T G AAT AG AG AT GAG C AC T AC ACT AT T GAT 
GAAC C AAC AGT T GAT AAC C AAAAT AC AT T AAAAAT T ACGTT T AAAC C AGA 
GAAAT T T AAAG AAAT T G CT GAG C T AC T T AAAGG AAT GAC C C T T GT T AAAA 
AT C AAG AT G C T C T T GAT AAAG CT AC T G C AAAT AC AG AT GAT G C GG C AT T T 
T T G GAAAT T C C AGT T G CAT C AAC T AT T AAT G AAAAAG C AG T T T TAG GAAA 
AGC AAT T G AAAAT ACTTTTG AAC TTC AAT AT GAC CAT ACT CCT GAT AAAG 
C T G AC AAT c C AAAAC CAT C T AAT C C T C C AAGAAAAC C AG AAGT T CAT AC T 
GGT GGGAAACGAT T T GTAAAGAAAGACT CAACAGAAAC ACAAAC ACT AGG 
TGGTGCTGAGTTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGA 
C AG AT G C T CT T AT T AAAG C G AAT AC T AAT AAAAACT AT AT T G CT GGAG AA 
GCTGTTACTGGGCAACCAATCAAATTGAAATCACATACAGACGGTACGTT 
TGAGATTAAAGGTTTGGCTTATGCAGTTGATGCGAATGCAGAGGGTACAG 
C AGT AAC T T AC AAAT T AAAAG AAAC AAAAGC AC C AG AAGGT TAT GT AAT C 
C C T GAT AAAG AAAT C G AGT T T AC AGT AT C AC AAAC AT C T TAT AAT C C AAA 
AC C AAC T G AC AT C AC GG T T GAT AGT G C T GAT G C AAC AC CT GAT AC AAT T A 
AAAAC AAC AAAC GT C CT T C A 

SEQ ID NO. 8709 

STRAIN JM9130013 

G C AGAAG T GT C AC AAG AAC G C C C AG C G AAAAC AG C AGT A 
AAT AT CT AT AAAT T AC AAG C T GAT AG T TAT AAAT CG G AAAT T AC T T C T AA 
T GGT G G T AT C G AG AAT AAAGAC G G C G AAG T AAT AT CT AACT AT G CT AAAC 
T T GG T G AC AAT GT AAAAGGT T T G C AAG G T GT AC AGT T T AAAC GT TAT AAA 
GT C AAG AC G GAT AT T T C T GT T GAT GAAT T G AAAAAAT T G AC AAC AGT T GA 
AG C AG C AG AT G CAAAAGT T GG AAC GAT T C T T G AAG AAGGT GT CAGT C T AC 
CTCAAAAAACTAATGCTCAAGGTTTGGTCGTCGATGCTCTGGATTCAAAA 
AG T AAT GT GAG AT AC T T G T AT G T AGAAG AT T T AAAG AAT T C AC C T T C AAA 
CAT T AC C AAAG CT T AT GC T GT AC CGTTTGTGT T GG AAT T AC C AGT T G CT A 
ACTCTACAGGTACAGGTTTCCTTTCTGAAATTAATATTTACCCTAAAAAC 
G T T G T AACT GAT GAAC C AAAAAC AG AT AAAG AT GT T AAAAAAT T AGG T C A 
G G AC GAT GC AGGT T AT AC GAT T G GT G AAG AAT T C AAAT GGT T CT T G AAAT 
CT AC AAT CC CTG C C AAT T TAG GT G ACT AT G AAAAAT T T GAAATT ACT GAT 
AAATTTGCAGATGGCTTGACTTATAAATCTGTTGGAAAAATCAAGATTGG 
T T C G AAAAC AC T GAAT AG AG AT G AG C ACT AC AC TAT T GAT G AAC C AAC AG 
T TGAT AACCAAAAT ACAT T AAAAAT TACGT T TAAAC C AGAGAAATT T AAA 
G AAAT T GCT G AGCT AC T T AAAGGAAT G AC C C T T GT T AAAAAT C AAG AT G C 
T CT T GAT AAAG C T AC T G C AAAT AC AGAT GAT GC GGC AT T T T T G G AAAT T C 
C AGT T G CAT C AAC TAT T AAT G AAAAAGC AG T T T TAG G AAAAGC AAT T G AA 
AATACTTTTGAACTTCAATATGACCATACTCCTGATAAAGCTGACAATCC 
AAAAC CAT C T AAT c CT c C AAGAAAAC C AG AAGT T CAT ACT G GT G G G AAAC 
GAT TT GTAAAGAAAGACT C AAC AGAAACACAAAC ACT AGGTGGTGCTGAG 
TTTGATTTGTTGGCTTCTGATGGGACAGCAGTAAAATGGACAGATGCTCT 
T AT T AAAG CG AAT AC T AAT AAAAAC TAT AT T G CT G GAG AAG C T G T T ACTG 
G G C AAC C AAT C AAAT T G AAAT C AC AT AC AG AC G GT AC GT T T GAG ATT AAA 
GGTTTGGC T T AT G C AGT T GAT G C GAAT G C AG AGGG T AC AG C AG T AAC T T A 
CAAAT T AAAAGAAACAAAAGCAC CAGAAGGT TATGT AAT CC CT GAT AAAG 
AAAT C G AGT T TAG AG TAT C AC AAAC AT C T TAT AAT AC AAAAC C AAC T G AC 
AT C AC G GT T GAT AGT GCT GAT G C AAC AC C T GAT AC AAT T AAAAAC AAC AA 
ACGTCCTTCA 

SEQ ID NO. 8710 
STRAIN 2603 frame: 1 

MKLSKKLLFSAAVLTMVAGSTVEPVAQFATGMSIVRAAEVSQERPAKTTVNIYKLQADSY 
KSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFKRYKVKTDISVDELKKLTTVEAAD
AKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLYVEDLKNSPSNITKAYAVPFVLEL
PVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQDDAGYTIGEEFKWFLKSTIPANL
GDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHYTIDEPTVDNQNTLKITFKPEKFK
EIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVASTINEKAVLGKAIENTFELQYDHT
PDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLGGAEFDLLASDGTAVKWTDALIKA
NTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVDANAEGTAVTYKLKETKAPEGYVI 
PDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNKRPSIPNTGGIGTAIFVAIGAAVM 
AFAVKGMKRRTKDN 

SEQ ID NO. 8711 

STRAIN 090 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY 
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ 
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY 
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8712 

STRAIN 18RS21 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVK.LGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8713 

STRAIN M732 frame: 1 

AEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8714 

STRAIN M781 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
R 

SEQ ID NO. 8715 

STRAIN COH1 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITXNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8716 

STRAIN CJB110 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKLEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY 
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ 
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNPKPTDITVDSADATPDTIKNNK 
RPS 

SEQ ID NO. 8717 

STRAIN JM9130013 frame: 1 

AEVSQERPAKTAVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNNK
RPS 

SEQ ID NO. 8718 

STRAIN A909 frame: 1 

AEVSQERPAKTTVNIYKLQADSYKSEITSNGGIENKDGEVISNYAKLGDNVKGLQGVQFK 
RYKVKTDISVDELKKLTTVEAADAKVGTILEEGVSLPQKTNAQGLVVDALDSKSNVRYLY
VEDLKNSPSNITKAYAVPFVLELPVANSTGTGFLSEINIYPKNVVTDEPKTDKDVKKLGQ
DDAGYTIGEEFKWFLKSTIPANLGDYEKFEITDKFADGLTYKSVGKIKIGSKTLNRDEHY
TIDEPTVDNQNTLKITFKPEKFKEIAELLKGMTLVKNQDALDKATANTDDAAFLEIPVAS
TINEKAVLGKAIENTFELQYDHTPDKADNPKPSNPPRKPEVHTGGKRFVKKDSTETQTLG
GAEFDLLASDGTAVKWTDALIKANTNKNYIAGEAVTGQPIKLKSHTDGTFEIKGLAYAVD
ANAEGTAVTYKLKETKAPEGYVIPDKEIEFTVSQTSYNTKPTDITVDSADATPDTIKNN 

SEQ ID NO. 8801 
STRAIN 2603 

AT G C CT AAG AAG AAAT C AG AT AC C C C AG AAAAAG AAGAAGT T GT C T T AACG G AAT G G C AA 
AAG C GT AAC C T T G AAT T T T T AAAAAAAC G C AAAG AAG AT G AAG AAG AAC AAAAAC G T AT T 
AACGAAAAATT ACGCTTAGAT AAAAGAAGT AAATT AAAT AT TTCT T CT CCTGAAGAAC CT 
C AAAAT ACT ACT AAAAT T AAGAAGCT T CATTT T CCAAAGATTT CAAGACCT AAGATTGAA 
AAG AAAC AG AAAAAAGAAAAAAT AGT C AAC AG C T T AGC C AAAAC T AAT C G CAT TAG AAC T 
GCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTTCCGTTTTCCTACTAACTCCT 
TT T AG T AAG C AAAAAAC AAT AAC AGT T AGT G G AAAT C AG CAT AC AC C T GAT GAT AT T T T G 
AT AGAG AAAACGAAT AT T CAAAAAAACGAT T AT TT CTT TT CTTT AAT TTTT AAAC AT AAA 
GC T AT T G AAC AAC G T T T AG CT G C AGAAG AT GT AT G G GT AAAAAC AG C T C AG AT G ACT TAT 
C AAT T T C C C AAT AAG T T T CAT AT T C AAGT T C AAGAAAAT AAG AT T AT T G CAT AT G C AC AT 
AC AAAG C AAGG AT AT C AAC CTGTCTTG G AAAC T G G AAAAAAGG C T GAT C C T GT AAAT AGT 
T C AG AG CT ACC AAAG C AC TTCT T AAC AAT T AAC CT T G AT AAGG AAG AT AGT AT T AAG C T A 
T T AAT T AAAGAT T T AAAGG C T T TAG AC C C T GAT T T AAT AAGT GAG AT T C AGG T GAT AAGT 
T T AG C T GAT T CT AAAAC G AC AC C T G AC CT C CT G C T G T TAG AT AT G C AC G ATG G AAAT AGT 
ATT AGAAT ACC AT T AT CTAAAT TT AAAGAAAGACT T CCTT T TT AC AAACAAATT AAG AAG 
AACCTTAAGGAACCTTCTATTGTTGATATGGAAGTGGGAGTTTACACAACAACAAATACC 
AT T G AAT C AAC C C CT GT T AAAG C AGAAG AT AC AAAAAAT AAAT C AACT GAT AAAAC AC AA 
AC AC AAAAT G G T C AG GT T G C G G AAAAT AGT C AAG GAC AAAC AAAT AAC T C AAAT ACT AAT 
CAACAAGGACAACAGATAGCAACAGAGCAGGCACCTAACCCTCAAAATGTTAAT 

SEQ ID NO. 8802 
STRAIN H36B 

C C T AAGAAG AAAT C AG AT AC C C C AG AAAAAG AAG AAG T T 
GTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCAA 
AGAAG AT G AAG AAG AAC AAAAAC GT AT T AAC G AAAAAT T ACG C T T AG AT A 
AAAGAAGT AAAT T AAAT AT T T CTT CT CCTGAAGAAC CT C AAAAT ACT ACT 
AAAATT AAGAAG CTT CAT T T T C C AAAGAT T T C AAG AC C T AAG AT T G AAAA 
G AAAC AG AAAAAAGAAAAAAT AGT C AAC AG CTT AG C C AAAACT AAT C G C A 
TTAGAACTGCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTTCC 
GTTTTCCTACTAACTCCTTTTAGTAAGCAAAAAACAATAACAGTTAGTGG 
AAAT C AGC AT AC ACCT GAT GAT ATTTTGAT AGAGAAAACGAAT ATT CAAA 
AAAAC GATT ATT TCTTTTCTTTAATTTTT AAAC AT AAAGCT AT TGAACAA 
C GT T TAG C T G C AG AAG AT G T AT G G G T AAAAAC AG C T C AG AT G ACT TAT C A 
ATTT CC C AAT AAGTTT CAT ATT CAAGTT C AAGAAAAT AAGAT TAT T GCAT 
AT G C AC AT AC AAAG C AAGGAT AT C AAC CTGTCTTG G AAAC T GG AAAAAAG 
GCTGATCCTGTAAATAGTTCAGAGCTACCAAAGCACTTCTTAACAATTAA 
CCTT GAT AAGG AAG AT AGT AT T AAG C TAT T AAT T AAAG AT T T AAAG G C T T 
TAG AC C C T GAT T T AAT AAG T GAG AT T CAGGT GAT AAGT T T AGCT GAT T CT 
AAAACGACACCTGACCTCCTGCTGTTAGATATGCACGATGGAAATAGTAT 
T AG AAT AC CAT TAT C T AAAT T T AAAG AAAG ACT T C C T T T T T AC AAAC AAA 
T T AAG AAG AAC C T T AAGG AAC C TT C TAT T GT T GAT AT G GAAGT G G G AGT T 
T AC AC AAC AAC AAAT AC CAT T G AAT C AAC C C CT GT T AAAG C AGAAGAT AC 
AAAAAAT AAAT C AAC T G AT AAAAC AC AAAC AC AAAAT GGT C AG GT T G C GG 
AAAAT AGT CAAGGAC AAAC AAAT AAC T C AAAT AC T AAT C AAC AAG G AC AA 
C AGAT AG C AAC AGAG C AGG C AC C T AAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8803 
STRAIN 18RS21 

C C T AAG AAG AAAT C AG AT AC C C C AGAAAAAG AAGAAGT T 
GTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCAA 
AGAAGAT G AAG AAG AAC AAAAAC GT AT T AAC G AAAAAT T ACGC T T AGAT A 
AAAG AAGT AAAT T AAAT AT TTCTTCTCCT G AAG AAC C T C AAAAT AC TACT 
AAAAT T AAG AAG C T T CAT T T T C C AAAG AT T T C AAG AC CT AAG AT T GAAAA 
G AAAC AG AAAAAAGAAAAAAT AGT C AAC AG C T T AGC C AAAAC T AAT C GC A 
T TAG AAC T G C AC C TAT AT T T GT AGT AGC AT T C C T AG T CAT T T T AGT T T C C 
GTTTTCCTACTAACTCCTTTTAGTAAGCAAAAAACAATAACAGTTAGTGG 
AAAT C AG CAT AC AC CT GAT GAT AT T T T GAT AGAGAAAACGAAT AT T C AAA 
AAAACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCTATTGAACAA 
C GT T TAG C T G C AGAAGAT GT AT GGGT AAAAAC AG CT C AGAT G AC T TAT C A 
AT T T C C C AAT AAGT T T CAT AT T C AAGT T CAAG AAAAT AAGAT TAT T GC AT 
AT G C AC AT AC AAAG C AAGGAT AT C AAC C T GT CT T GGAAAC T GGAAAAAAG 
GC T GAT C C T GT AAAT AG T T C AG AGCT AC C AAAGC AC T T C T T AAC AAT T AA 
CCTTGATAAGGAAGATAGTATTAAGCTATTAATTAAAGATTTAAAGGCTT 
TAGACCCTGATTTAATAAGTGAGATTCAGGTGATAAGTTTAGCTGATTCT 
AAAACGACACCTGACCTCCTGCTGTTAGATATGCACGATGGAAATAGTAT 
TAG AAT AC CAT TAT C T AAAT T T AAAGAAAGACT T C CT T T T T AC AAAC AAA 
TTAAGAAGAACCTTAAGGAACCTTCTATTGTTGATATGGAAGTGGGAGTT 
T AC AC AAC AAC AAAT AC CAT T GAAT C AAC C C C T GT T AAAG C AG AAGAT AC 
AAAAAAT AAAT CAACT GAT AAAAC AC AAAC AC AAAAT GGT CAGGT T GCGG 
AAAAT AG T C AAGG AC AAAC AAAT AAC TC AAAT ACT AAT C AAC AAGG AC AA 
C AG AT AG C AAC AG AG C AG G C AC CT AAC C CT C AAAAT GT T AAT 

SEQ ID NO. 8804 
STRAIN M732 

C C T AAG AAG AAAT C AGAT ACCC C AGAAAAAG AAG AAG 
TTGTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGC 
AAAG AAG AT GAAG AAG AAC AAAAAC GT AT T AAC GAAAAAT T AC G CT TAG A 
TAAAAG AAGT AAATT AAAT AT T T CT T CT C CT GAAG AAC CT C AAAAT ACT A 
CT AAAAT T AAG AAGC T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T G AA 
AAG AAAC AGAAAAAAG AAAAAAT AGT C AAC AG C T TAG C C AAAACT AAT CG 
CATTAGAACTGCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTT 
C C GT T T T C CT AC T AACT C CT T T T AGT AAG CAAAAAAC AAT AAC AGT T AGT 
GGAAAT CAGC AT AC ACCT GAT GAT AT T TT GAT AGAAAAAACGAAT ATT CA 
AAAAAACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCTATTGAAC 
AAC G T T T AGC T G C AG AAG AT GT AT G G GT AAAAAC AG CT C AG AT G ACT TAT 
C AAT T T C C C AAT AAGT T T CAT AT T C AAGT T CAAG AAAAT AAG AT TAT T G C 
AT AT GC AC AT AC AAAG C AAGG AT AT C AG C C T GT C T T GGAAAC T G G AAAAA 
AG G C T GAT C CT G T AAAT AGT T C AG AG C T AC C AAAG C AC T T C T T AAC AAT T 
AAC C T T GAT AAGG AAG AT AG TAT T AAG CT AT T AAT T AAAG AT T T AAAG G C 
T TT AG AC C C T GAT T T AAT AAGT GAG AT T C AG GT GAT AAGT T T AG CT GAT T 
C T AAAAC G AC AC C T G AC CTCCTGCTG T TAG AT AT G CAT GAT GGAAAT AGT 
ATTAGAATACCATTATCTAAATTTAAAGAAAGACTTCCTTTTTACAAACA 
AAT T AAG AAG AAC CT T AAGG AAC C T T CT AT T GT T GAT AT GG AAGT G GG AG 
TTTACAC AAC AACAAGTACTAT T GAAT C AAC C C CTGT GAAAGCGG AAGAT 
AC AAAAAAT AAAT C AAC T GAT AAAAC AC AAAC AC AAAAT GGT C AG GT T G C 
GG AAAAT AG T CAAGGAC AAAC AAAT AAC T C AAAT AC T AAT C AAC AAGGAC 
AAC AG AT AG C AAC AG AG C AG G C AC C C AAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8805 
STRAIN COH1 

C C T AAG AAG AAAT C AG AT AC C C C AGAAAAAG AAG AAGT T 
GTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCAA 
AG AAG AT GAAGAAGAAC AAAAAC G T AT T AAC G AAAAAT T AC G CT T AGAT A 
AAAGAAGT AAAT T AAAT AT TTCTTCTCCT G AAG AAC C T C AAAAT AC T AC T 
AAAAT T AAGAAGCT T CAT T T T C C AAAG AT T T C AAAAC C T AAGAT T G AAAA 
GAAACAGAAAAAAGAAAAAATAGT CAACAGCTT AGC CAAAACT AATCGCA 
T T AGAAC TG CAC CT AT AT T T GT AGT AG CAT T C C T AGT C AT T T T AGT T T CC 
GT T T T C CT AC T AAC T C C T T T TAG T AAGC AAAAAACAAT AAC AGT T AGT GG 
AAAT C AG CAT AC AC CT GAT GAT AT T T T GAT AG AAAAAAC G AAT AT T CAAA 
AAAAC GAT TAT TTCTTTTCTT T AAT T T T T AAAC AT AAAG CT AT T GAAC AA 
CGTTTAGCTGCAGAAGATGTATGGGTAAAAACAGCTCAGATGACTTATCA 
AT T T C C C AAT AAGT T T CAT AT T C AAGT T C AAG AAAAT AAG AT TAT T G CAT 
AT G C AC AT AC AAAGC AAGG AT AT C AG C CTGTCTTG G AAAC T GG AAAAAAG 
GC T GAT C C T GT AAAT AGT T C AG AG C T AC C AAAGC ACT T CT T AAC AAT T AA 
C C T T GAT AAGG AAGAT AGT AT T AAG C TAT T AAT T AAAG AT T T AAAGG C T T 
TAG AC C C T GAT T T AAT AAG T G AGAT T C AGGT GAT AAGT T TAG C T GAT T CT 
AAAAC G AC AC C T G AC CTCCTGCTGT TAG AT AT G CAT GAT G G AAAT AG TAT 
TAG AAT AC CAT TAT C T AAAT T T AAAGAAAGAC TTCCTTTT T AC AAAC AAA 
TTAAGAAGAACCTTAAGGAACCTTCTATTGTTGATATGGAAGTGGGAGTT 
T AC AC AAC AAC AAGT AC TAT T G AAT CAAC C C C T GT G AAAG CGG AAG AT AC 
AAAAAAT AAAT CAACT GAT AAAAC AC AAAC AC AAAAT GG T C AGGT T G CGG 
AAAAT AGT CAAGGAC AAAC AAAT AACT C AAAT ACTAAT CAACAAGGACAA 
C AG AT AG C AAC AGAG CAG G CAC C CAAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8806 
STRAIN M781 

C C T AAGAAG AAAT CAG AT AC CC C AG AAAAAG AAG AAG 
TTGTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGC 
AAAG AAG AT G AAG AAG AAC AAAAAC GT AT T AAC GAAAAAT T AC GC T T AG A 
T AAAAGAAG T AAAT T AAAT AT TTCTTCTCCT GAAG AAC CT C AAAAT AC T A 
C T AAAAT T AAG AAG C T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T G AA 
AAGAAAC AG AAAAAAG AAAAAAT AGT CAAC AG C T TAG C C AAAAC T AAT CG 
CAT TAG AACT G CAC C TAT AT T T GT AG T AGC AT T C C T AGT CAT T T T AGT T T 
CCGTTTTC C TACT AAC T C C T T T T AGT AAG C AAAAAAC AAT AAC AG T T AGT 
G GAAAT CAG CAT AC AC C T GAT GAT AT T T T GAT AG AAAAAAC G AAT AT T C A 
AAAAAACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCT ATT GAAC 
AAC GT T TAG C T G C AGAAG AT G TAT G G G T AAAAAC AG CT C AGAT GAC T TAT 
C AAT T T C C C AAT AAGT T T CAT AT T C AAG T T C AAG AAAAT AAGAT TAT T G C 
AT AT G CAC AT AC AAAG C AAGG AT AT CAG C C T GT C T T GG AAACT G G AAAAA 
AG G C T GAT C C T GT AAAT AG T T CAG AGC T AC C AAAG CAC T T C T T AAC AAT T 
AAC C T T G AT AAGGAAG AT AGT ATT AAG C TAT T AAT T AAAG AT T T AAAGG C 
TTTAGACCCTGATTTAATAAGTGAGATTCAGGTGATAAGTTTAGCTGATT 
CT AAAAC GAC AC CT G AC C T C C T G C T GT T AG AT AT G CAT G AT GG AAAT AGT 
ATTAGAATACCATTATCTAAATTTAAAGAAAGACTTCCTTTTTACAAACA 
AAT T AAG AAGAAC C T T AAG GAAC C T T C T AT T GT T G AT AT GG AAGT GGG AG 
T T T AC AC AAC AAC AAG TACT AT T G AAT CAAC C C C T GT GAAAGC G G AAGAT 
ACAAAAAATAAATCAACTGATAAAACACAAACACAAAATGGTCAGGTTGC 
GGAAAATAGT CAAGGACAAACAAATAACT C AAATACTAAT C AACAAGGAC 
AAC AG AT AG CAAC AG AGC AG GC AC C C AAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8807 
STRAIN CJB110 

C C T AAG AAGAAAT CAG AT AC CC CAG AAAAAG AAG AAG 
TTGTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGC 
AAAG AAG AT GAAG AAGAAC AAAAAC G TAT T AAC G AAAAAT T ACG CT T AG A 
TAAAAG AAGT AAAT T AAAT AT TTCTTCTCCT G AAGAAC C T C AAAAT AC T A 
C T AAAAT T AAG AAG C T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T G AA 
AAG AAAC AG AAAAAAGAAAAAAT AGT CAAC AG C T TAG C C AAAAC T AAT CG 
CATTAGAACTGCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTT 
CCGTTTTCCTACTAACTCCTTTTAGTAAGCAAAAAACAATAACAGTTAGT 
GG AAAT CAG CAT AC AC CT G AT GAT AT T T T GAT AG AAAAAAC G AAT AT T C A 

AAAAAACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCTATTGAAC 
AAC G T T T AG C T G CAG AAG AT G T AT G GG T AAAAAC AG C T CAG AT G ACT TAT 
C AAT T T C C C AAT AAG T T T CAT AT T C AAGT T C AAG AAAAT AAG AT TAT T G C 
AT AT G CAC AT AC AAAGC AAG GAT AT CAG CCTGTCTTG GAAACT G G AAAAA 
AG G C T GAT C C T GT AAAT AGT T C AGAG C T AC C AAAGC AC T T C T T AAC AAT T 
AACCTTGATAAGGAAGATAGTATTAAGCTATTAATTAAAGATTTAAAGGC 
T T T AGAC C C T GAT T T AAT AAGT G AG AT T C AGGT G AT AAGT T T AGC T G ATT 
C T AAAAC GAC AC C T G AC CT CCTGCTGT T AGAT AT GC AT GAT GG AAAT AGT 
ATTAGAATACCATTATCTAAATTTAAAGAAAGACTTCCTTTTTACAAACA 
AAT T AAG AAG AAC C T T AAGG AAC C T T C TAT TGT T GAT AT GG AAGT GGG AG 
T T T AC AC AAC AAC AAGT AC TAT T GAAT C AACC C CT GT GAAAG C G G AAG AT 
AC AAAAAAT AAAT C AAC T GAT AAAACAC AAAC AC AAAAT GGT C AGGT T GC 
G GAAAAT AGT C AAGGAC AAAC AAAT AACT C AAAT AC T AAT C AAC AAG GAC 
AAC AG AT AG C AAC AG AG C AGG C AC C CAAC CC T C AAAAT GT T AAT 

SEQ ID NO. 8808 
STRAIN 1169NT 

C CT AAGAAG AAAT C AGAT AC C C C AGAAAAAG AAGAAGT 
TGTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCA 
AAGAAG AT G AAG AAG AAC AAAAAC GT AT T AAC G AAAAAT T AC G CT TAG AT 
AAAAG AAGT AAAT T AAAT AT TTCTTCTCCT GAAG AAC CT C AAAAT AC T AC 
T AAAAT T AAG AAG C T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T GAAA 
AGAAACAGAAAAAAGAAAAAATAGTCAACAGCTTAGCCAAAACTAATCGC 
AT TAG AAC T G C AC C TAT AT T TAT AGT AG CAT T C C T AGT CAT T T T AGT T T C 
CGTTTTCC T AC T AAC T C CT T T T AGT AAGC AAAAAAC AAT AAC AGT T AGT G 
G AAAT C AGC AT AC AC C T GAT GAT AT T T T GAT AG AG AAAACG AAT AT T C AA 
AAAAAC GAT TAT TTCTTTTCTT T AAT T T T T AAAC AT AAAG C TAT T G AAC A 
AC GT T TAG CT G C AG AAGAT G TAT GG GT AAAAAC AG CT C AG AT GACTT AT C 
AAT T T C C CAAC AAG T T T CAT AT T C AAGT T C AAG AAAAT AAG AT TAT T G C A 
T A t G C AC AT AC AAAG C AAG GAT AT C AG C C T G T C T T GG AAAC T G G AAAAAA 
GG C T GAT C CT GT AAAT AGT T C AG AG CT AC C AAAGC AC T T C T T AAC AAT T A 
AC C T T GAT AAG GAAG AT AG TAT T AAG C T AT T AAT T AAAG AT T T AAAGGC T 
T TAG AC C C T GAT T T AAT AAGT GAG AT T C AGGT GAT AAG T T TAG C T GAT T C 
T AAAAC GAC AC C T GAC C TCCTGCTGT TAG AT AT G C AC GAT G G AAAT AG T A 
T TAG AAT AC CAT TAT C T AAAT T T AAAG AAAG AC TTCCTTTT T AC AAAC AA 
AT T AAG AAG AAC C T T AAG G AAC C T T CT AT T G T T GAT AT GG AAGT G G GAGT 
T T AC AC AAC AAC AAGT AC TAT T GAAT CAAC C C C T GT GAAAG CG GAAG AT A 
C AAAAAAT AAAT CAAC T GAT AAAAC AC AAAC C C AAAAT GGT C AGGT T G C G 
GAAAATAGT C AAGG AC AAAC AAAT AACT C AAAT ACT AAT CAAC AAGG AC A 
AC AAC AGAT AG C AAC G G AG C AGG C AC C CAAC C C T C AAAAT GT T AAT 

SEQ ID NO. 8809 
STRAIN JM9130013 

C C T AAG AAG AAAT C AG AT AC C C C AGAAAAAG AAGAAG T T 
GTCTTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTAAAAAAACGCAA 
AGAAG AT GAAG AAG AAC AAAAAC GT AT T AAC G AAAAAT T AC G C T T AGAT A 
AAAG AAG T AAAT T AAAT ATT T CT T CT C C T GAAG AAC C T C AAAAT AC TACT 
AAAA T T AAG AAG C T T CAT T T T C C AAAG AT T T C AAG A C C T AAGAT T G AAAA 
G AAAC AG AAAAAAG AAAAAAT AG T CAAC AG C T T AG C C AAAACT AAT C GC A 
TTAGAACTGCACCTATATTTGTAGTAGCATTCCTAGTCATTTTAGTTTCC 
GTTTTCCTACTAACTCCTTTT AGT AAGC AAAAAAC AAT AAC AGTTAGTGG 
AAAT C AG CAT AC AC CT GAT GAT AT T T T G AT AG AGAAAAC GAAT AT T C AAA 
AAAACG AT TAT TTCTTTTCTT T AAT T T T T AAAC AT AAAG C TAT T G AAC AA 
CGTTTAGCTGCAGAAGATGTATGGGTAAAAACAGCTCAGATGACTTATCA 
AT T T C C C AAT AAG T T T CAT AT T C AAG T T C AAGAAAAT AAG AT TAT T G CAT 
AT G C AC AT AC AAAGC AAGG AT AT CAAC CTGTCTTG GAAAC T G G AAAAAAG 
G CT GAT C C T G T AAAT AG T T C AG AG C T AC C AAAG C AC TT C T T AAC AAT T AA 
CCTTGATAAGGAAGATAGTATTAAGCTATTAATTAAAGATTTAAAGGCTT 
TAG AC C C T GAT T T AAT AAGT GAG AT T C AGGT GAT AAGT T T AG CT G AT T CT 
AAAAC GAC AC C T GAC CTCCTGCTGT TAG AT AT G C AC GAT G G AAAT AG TAT 
TAG AAT AC CAT TAT C T AAAT T T AAAG AAAG AC TTCCTTTT T AC AAAC AAA 
T T AAG AAG AAC C T T AAGG AAC C T T CT AT T GT T GAT AT G G AAGT GG GAGT T 
TAG AC AAC AAC AAAT AC CAT T GAAT CAAC C C C T GT T AAAG C AG AAG AT AC 
AAAAAAT AAAT CAAC T GAT AAAAC AC AAAC AC AAAAT GGT C AG GT T G CGG 
AAAAT AG T C AAG GAC AAAC AAAT AAC T C AAAT AC T AAT CAAC AAG GAC AA 
C AG AT AG CAAC AG AG C AG G C AC C T AAC C CT C AAAAT G T T AAT 

SEQ ID NO. 8810 
STRAIN A909 

CCT AAG AAG AAAT C AG AT AC C C C AG AAAAAGAAG AAGT T GT C 
TTAACGGAATGGCAAAAGCGTAACCTTGAATTTTTaaAAAAACGCAAAGA 
AGAT GAAGAAGAa CAAAAACGT ATT AACGAAAAAT TACGCTTAGATAAAA 
GAAGT AAAT T AAAT AT TTCTTCTCCT GAAGAAC CT C AAAAT ACT AC T AAA 
AT T AAGAAG C T T CAT T T T C C AAAGAT T T C AAGAC C T AAGAT T G AAAAG AA 
AC AG AAAAAAG AAAAAAT AGT C AAC AGC T TAG C CAAAACT AAT C G C AT T A 
G AACT GC AC C TAT AT T T GT AG TAG CAT T C C T AGT CAT T T T AGTT T C C G T T 
TTCCTACTAACTCCTTTTAGTAAGCAAAAAACAATAACAGTTAGTGGAAA 
T CAGCAT ACAC CT GAT GAT AT T T T G AT AGAG AAAAC G AAT AT T C AAAAAA 
ACGATTATTTCTTTTCTTTAATTTTTAAACATAAAGCTATTGAACAACGT 
TTAGCTGCAGAAGATGTATGGGTAAAAACAGCTCAGATGACTTATCAATT 
T C C C AAT AAGT T T CAT AT T C AAGT T CAAGAAAAT AAGAT TAT T G CAT AT G 
C AC AT AC AAAGC AAGG AT AT C AAC CT G T CT T GG AAACT G G AAAAAAG G C T 
GAT C C T GT AAAT AGT T CAGAG CT AC C AAAG C AC T T C T T AAC AAT T AAC C T 
TGATAAGGAAGATAGTATTAAGCTATTAATTAAAGATTTAAAGGCTTTAG 
AC C C T GAT T T AAT AAGT GAG AT T C AGG T GAT AAGT T TAG C T GAT T C T AAA 
ACGACACCTGACCTCCTGCTGTTAGATATGCACGATGGAAATAGTATTAs 
AAT AC CAT TAT C T AAAT T T AAAG AAAGACT T C C T T T T T AC AAAC AAAT T A 
AG AAG AAC CT T AAGGAAC C T T CT AT T GT T GAT AT GGAAGT GGGAGT T T AC 
AC AAC AAC AAAT AC CAT T G AAT C AAC C C C T G T T AAAGC AG AAGAT AC AAA 
AAAT AAAT C AAC T GAT AAAAC AC AAmC AC AAAAT G GT C AGGT T G C G G AAA 
AT AGT CAAGGACAAAC AAAT AACT C AAAT ACT AAT C AAC AAGG AC AAC AG 
AT AG C AAC AGAGC AGGC ACCT AAC CCT C AAAAT GT T AAT 

SEQ ID NO. 8811 
STRAIN 090 

T AAG AAG AAAT C AG AT AC C C C AGAAAAAG AAG AAG T T G T C T T AACGGAAT 
GG C AAAAG C GT AAC C T T G AAT T T T T AAAAAAAC G C AAAG AAG AT GAAG AA 
GAAC AAAAACGT AT T AAC GAAAAAT T AC G C T TAG AT AAAAG AAG T a a aT T 
AAAT AT T T C T T C T C CT GAAGAAC C T C AAAAT AC T AC T AAAAT T AAG AAGC 
T T CAT T T T C C AAAG AT T T C AAAAC C T AAG AT T G AAAAGAAAC AGAAAAAA 
GAAAAAAT AGT C AAC AGC T TAG C CAAAACT AAT C G CAT TAG AAC T G C AC C 
TATATTTGTAGTAGCATTCCTAGTCATTTTAGTTTCCGTTTTCCTACTAA 
C T C C T T T T AGT AAGC AAAAAAC AAT AAC AGT T AGT G GAAAT C AG CAT AC A 
CCT GAT GAT AT T T T GAT AGAAAAAAC G AAT AT T C AAAAAAAC GAT TAT T T 
CTTTTCTTTAATTTTTAAACATAAAGCTATTGAACAACGTTTAGCTGCAG 
AAGATGTATGGGTAAAAACAGCTCAGATGACTTATCAATTTCCCAATAAG 
T T T CAT AT T C AAGT T CAAGAAAAT AAG AT TAT T GC AT AT G C AC AT AC AAA 
G C AAGG AT AT C AG CCTGTCTT GG AAAC T G G AAAAAAGG C T GAT CCT GT AA 
AT AG T T C AG AG CT AC C AAAG C AC T T C T T AAC AAT T AAC CT T G AT AAGGAA 
GAT AGT AT T AAG C T AT T AAT T AAAG AT T T AAAG G C T T TAG AC CCT GAT T T 
AAT AAGT GAG AT T C AG GT GAT AAGT T T AG C T GAT T C T AAAAC G AC AC C T G 
ACCTCCTGCTGTTAGATATGCATGATGGAAATAGTATTAGAATACCATTA 
T C T AAAT T T AAAG AAAG AC TTCCTTTT T AC AAAC AAAT T AAG AAG AAC C T 
T AAGG AAC CT T CT AT T GT T GAT AT G GAAGT G G GAG T T T AC AC AAC AAC AA 
GT AC TAT T G AAT C AAC C C C T GT G AAAG C GGAAG AT AC AAAAAAT AAAT C A 
AC T GAT AAAAC AC AAAC AC AAAAT GG T C AG G T T G C GG AAAAT AGT C AAG G 
AC AAAC AAAT AAC T C AAAT ACT AAT C AAC AAG G AC AAC AG AT AG C AAC AG 
AG C AG G C AC C C AAC CCT C AAAAT GT T AAT 

SEQ ID NO. 8812 
STRAIN 2603 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN 
LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ 
QGQQIATEQAPNPQNVN 

SEQ ID NO. 8813 

STRAIN H36B frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL 
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN 
LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ 
QGQQIATEQAPNPQNVN 

SEQ ID NO. 8814 

STRAIN 18RS21 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ 
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL 
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN 
LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ 
QGQQIATEQAPNPQNVN 

SEQ ID NO. 8815 

STRAIN M732 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN 

SEQ ID NO. 8816 

STRAIN COH1 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ 
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL 
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN 
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ 
QGQQIATEQAPNPQNVN 

SEQ ID NO. 8817 

STRAIN M781 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN 

SEQ ID NO. 8818 

STRAIN CJB110 frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFVVAFLVILVSVFLLTPF 
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ 
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL 
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN 
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ 
QGQQIATEQAPNPQNVN 

SEQ ID NO. 8819 

STRAIN 1169NT frame: 1 

PKKKSDTPEKEEVVLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFIVAFLVILVSVFLLTPF 
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ 
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL 
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN 
LKEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQQIATEQAPNPQNVN

SEQ ID NO. 8820 

STRAIN JM9130013 frame: 1 

PKKKSDTPEKEEWLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFWAFLVILVSVFLLTPF 
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ 
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8821 

STRAIN A909 frame: 1

PKKKSDTPEKEEWLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQ
NTTKIKKLHFPKISRPKIEKKQKKEKIVNSLAKTNRIRTAPIFWAFLVILVSVFLLTPF
SKQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQ
FPNKFHIQVQENKIIAYAHTKQGYQPVLETGKPCADPVNSSELPKHFLTINLDKEDSIKLL
IKDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIXIPLSKFKERLPFYKQIKKN
LKEPSIVDMEVGVYTTTNTIESTPVKAEDTKNKSTDKTQXQNGQVAENSQGQTNNSNTNQ
QGQQIATEQAPNPQNVN

SEQ ID NO. 8822 

STRAIN 090 frame: 2

KKKSDTPEKEEWLTEWQKRNLEFLKKRKEDEEEQKRINEKLRLDKRSKLNISSPEEPQN 
TTKIKKLHFPKISKPKIEKKQKKEKIVNSLAKTNRIRTAPIFWAFLVILVSVFLLTPFS 
KQKTITVSGNQHTPDDILIEKTNIQKNDYFFSLIFKHKAIEQRLAAEDVWVKTAQMTYQF
PNKFHIQVQENKIIAYAHTKQGYQPVLETGKKADPVNSSELPKHFLTINLDKEDSIKLLI
KDLKALDPDLISEIQVISLADSKTTPDLLLLDMHDGNSIRIPLSKFKERLPFYKQIKKNL
KEPSIVDMEVGVYTTTSTIESTPVKAEDTKNKSTDKTQTQNGQVAENSQGQTNNSNTNQQ
GQQIATEQAPNPQNVN 

SEQ ID NO. 8901

STRAIN 2603

AT G AAAAAAGG AC AAG T AAAT GAT ACT AAG C AAT C T T AC T C T C T AC GT AAA 

TATAAATTTGGTTTAGCATCAGTAATTTTAGGGTCATTCATAATGGTCACAAGTCCTGTT 
T T T G CGGAT C AAAC T AC AT C GGT T C AAG T T AAT AAT C AG AC AG GC AC T AGT GT GGAT G C T 
AAT AAT T CT T C C AAT GAG AC AAG T G C GT C AAGT G T GAT T AC T T C C AAT AAT G AT AGT GT T 
C AAG C GT C T G AT AAAGT T G T AAAT AG T C AAAAT AC GG C AAC AAAG GAC AT TACT ACT C C T 
TTAGTAGAGACAAAGCCAATGGTGGAAAAAACATTACCTGAACAAGGGAATTATGTTTAT 
AGC AAAGAAAC CG AGG T G AAAAAT AC AC C T T C AAAAT C AG C C C C AG T AGC T T T C TAT G C A 
AAG AAAG GT GAT AAAGT T T T C T AT GAC C AAGT AT T T AAT AAAGAT AAT GT GAAAT G GAT T 
T CAT AT AAGT CTTTTTGT GGC GT AC G T C GAT AC G C AGC TAT T G AGT C AC T AGAT C CAT C A 
G G AG GT T C AG AG AC T AAAGC AC CT AC T C C T GT AAC AAAT T C AG G AAG C AAT AAT C AAGAG 
AAAAT AG C AAC G C AAG GAAAT TAT AC AT T T T C AC AT AAAGT AG AAGT AAAAAAT GAAG C T 
AAGG TAG C GAG T C C AACT C AAT T T AC AT T G GAC AAAG GAG AC AG AAT T T T T T ACG AC C AA 
AT AC T AAC TAT T GAAGG AAAT C AG T GGT TAT C T TAT AAAT CAT T C AAT GGTGTTCGTCGT 
TTTGTTTTG C TAG GT AAAG CAT C T T C AGT AG AAAAAAC T G AAG AT AAAGAAAAAGT GT C T 
C C T C AAC C AC AAG C C C GT AT T AC T AAAAC T G GT AGAC T GAC TAT T T C T AAC GAAAC AAC T 
AC AGGT T T T GAT AT T T T AAT T AC G AAT AT T AAAG AT GAT AAC GGT AT CGCTGCTG T T AAG 
GTACCGGTTTG GAC T G AAC AAG GAG G G C AAG AT GAT AT T AAAT G G TAT AC AG C T GT AAC T 
AC T G G G GAT G G C AAC T AC AAAGT AG C T GT AT CAT T T G C T GAC CAT AAG AAT GAG AAG GGT 
C T T TAT AAT AT T CAT T TAT AC T AC C AAG AAG C TAG T GGG AC AC T T G T AGGT G T AAC AG G A 
AC T AAAGT GAC AGT AG CT G G AAC T AAT T C T T C T C AAG AAC CT AT T G AAAAT GG T T T AG C A 
AAG AC T G GT GT T T AT AAT AT TAT C GG AAGT ACT GAAGT AAAAAAT GAAG C T AAAAT AT C A 
AGT C AG AC C C AAT T T AC T T T AG AAAAAG G T GAC AAAAT AAAT TAT GAT C AAGT AT T GAC A 

GCAGATGGTTACCAGTGGATTTCTTACAAATCTTATAGTGGTGTTCGTCGCTATATTCCT 
G T G AAAAAGC T AACT AC AAG T AGT G AAAAAG C G AAAG AT G AGG C GAC T AAAC C G AC T AGT 

TATCCCAACTTACCTAAAACAGGTACCTATACATTTACTAAAACTGTAGATGTGAAAAGT 
C AAC CT AAAGT AT C AAGT C C AGT G G AAT T T AAT T T T CAAAAG GGT G AAAAAAT AC AT TAT 
GAT C AAGT GT T AG T AGT AG AT GGT CAT C AG T G GAT T T CAT AC AAG AG T T AT T C C GG T AT T 
CGT CGCT ATATTGAAATT 

SEQ ID NO. 8902

STRAIN 090

AAAAAAG GAC AAGT AAAT GAT AC T AAG C AAT CT T AC T 
CT C T AC GT AAAT AT AAAT T T GGT T T AGC AT C AGT AAT T T T AGGGT CAT T C 
ATAATGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAGT 
T AAT AAT C AG AC AGG C AC T AGT GT GGAT G CT AAT AAT T C T T C C AAT GAGA 
CAAGTGCGTCAAGTGTGATTACTTCCAATAATGATAGTGTTCAAGCGTCT 
G AT AAAGT T GT AAAT AGT C AAAAT AC G G C AAC AAAG GAC AT TACT ACT C C 
T T T AGT AG AGAC AAAG C C AAT G GT GG AAAAAAC AT TAG CT GAACAAG GGA 
ATTATGTTTATAGCAAAGAAACCGAGGTGAAAAATACACCTTCAAAATCA 
GCCCCAGTAGCTTTCTATGCAAAGAAAGGTGATAAAGTTTTCTATGACCA 
AGT AT T T AAT AAAG AT AAT G T GAAAT GGAT T T CAT AT AAGT CTTTTTGTG 
GCGTACGTCGATACGCAGCTATTGAGTCACTAGATCCATCAGGAGGTTCA 
G AGACT AAAG C AC C T ACT C C T GT AAC AAAT T C AGGAAG C AAT AAT C AAG A 
G AAAAT AG C AAC G C AAGG AAAT TAT AC AT T T T C AC AT AAAGT AG AAGT AA 
AAAAT G AAG c T AAGGT AG CG AGT C C AACT C AAT T T AC AT T G GAC AAAG GA 
GAC AG AAT T T T T T AC GAC C AAAT AC T AACT AT T GAAGG AAAT C AG T G GT T 
ATCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTtTGCTAGGTAAAG 
CAT CT T C AGT AG AAAAAACT G AAG AT AAAG AAAAAGT GT C T C C T C AAC C A 
C AAG C C C GT AT T AC T AAAAC T G GT AG AC T G ACT AT T T C T AAC G AAAC AAC 
TACAGGTTTTGATATTTTAATTACGAATATTAAAGATGATAACGGTATCG 
CTGCTGTTAAGGTACCGGTTTGGACTGAACAAGGAGGGCAAGATGATATT 
AAAT G GT AT AC AG C T GT AAC T AC T G GGG ATGGC AAC T AC AAAG T AG CT GT 
AT CAT T T GC T GAC CAT AAG AAT G AG AAGGGT C T T TAT AAT AT T CAT T TAT 
AC T AC C AAG AAG C T AGT G G GAC AC T T GT AG G T G T AAC AG GAAC T AAAGT G 
AC AGT AG C T GG AAC T AAT T CT T C T C AAG AAC CT AT T G AAAAT G GT T TAG C 
AAAG AC T GGT G T T TAT AAT AT TAT CGG AAGT AC T G AAG T AAAAAAT G AAG 
C T AAAAT AT C AAG T C AG AC C C AAT T T AC T T T AG AAAAAG GT GAC AAAAT A 
AAT TAT GAT C AAGT AT T G AC AGC AGAT GGT T AC C AGT GGAT T T C T T AC AA 
AT C T TAT AG TGGTGTTCGT CG CT AT AT T C CT GT GAAAAAG C T AAC T AC AA 
GTAGTGAAAAAGCGAAaGATGAGGCGACTAAACCGACTAGTTATCCCAAC 
TTACCTAAAACAGGTACCTATACATTTACTAAAACTGTAGATGTGAAGAG 
TCAACCT AAAGT AT C AAGT CC AGT GGAATTT AAT TTTCAAAAGGGTGAAA 
AAAT AC AT TAT GAT C AAG TG T T AGT AGT AG AT GGT CAT C AG T G GAT T T C A 
T AC AAG AGT TAT T C C GGT AT T C G T C G CT AT AT T GAAAT T 

SEQ ID NO. 8903 

STRAIN A909 

AAAAAAGGAC AAG T AAAT GAT ACT AAG C AAT CT T AC 

T C T C T AC G T AAAT AT AAAT T T G G T T TAG CAT C AGT AAT T T T AG GGT CAT T 
CATAATGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAG 
T T AAT AAT C AG AC AG G C AC TAG T GT G GAT G CT AAT AAT T C T T C C AAT GAG 
AC AAGT G C G T C AAG T GT GAT T AC T T C C AAT AAT GAT AGT GT T C AAG C GT C 
T GAT AAAGT T GT AAAT AGT C AAAAT AC G GC AAC AAAGG AC AT T AC T AC T C 
C T T T AGT AG AG AC AAAG C C AAT GGT GG AAAAAAC AT T AC C T GAAC AAG GG 
AAT TAT GT T TAT AG C AAAG AAAC C G AGGT G AAAAAT AC AC CT T C AAAAT C 
AGC C C C AGT AG C T T T C T AT G C AAAGAAAGG T GAT AAAGT T T T CT AT GAC C 
AAGT AT T T AAT AAAGAT AAT G T GAAAT GGAT T T CAT AT AAGT CTTTTTGT 
GGCGTACGTCGATACGCAGCTATTGAGTCACTAGATCCATCAGGAGGTTC 
AGAGAC T AAAG C AC C T AC T C C T GT AAC AAAT T C AGG AAG C AAT AAT C AAG 
AG AAAAT AGC AAC G C AAGG AAAT TAT AC AT T T T C AC AT AAAG TAG AAGT A 
AAAAAT G AAG C T AAG G T AG CG AGT C C AACT C AAT T T AC AT T GG AC AAAG G 
AG AC AG AAT T T T T T AC GAC C AAAT AC T AAC TAT T G AAG GAAAT C AGT GGT 
TATCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTtTGCTAGGTAAA 
G CAT CT T C AGT AG AAAAAACT G AAG AT AAAG AAAAAGT GT C T C C T C AAC C 
AC AAG C C C G TAT T AC T AAAAC T GGT AG AC T GAC TAT T T C T AAC G AAAC AA 
CTACAGGTTTTGATATTTTAATTACGAATATTAAAGATGATAACGGTATC 
GCTGCTGT T AAG GTACCGGTTTG GAC T GAACAAG GAG G G C AAG AT GAT AT 
TAAATGGTATACAGCTGT AACT ACTGGGGATGGC AACT AC AAAGT AGCTG 
TAT CAT T T G C T GAC CAT AAG AAT GAG AAG G GT C T T TAT AAT AT T CAT T T A 
T AC T AC C AAG AAG CT AG T GG G AC AC T T GT AG G T G T AAC AG GAAC T AAAGT 
GACAGTAGCTGGAACTAATTCTTCTCAAGAACCTATTGAAAATGGTTTAG 
C AAAG AC T GGT G T T TAT AAT AT TAT C GG AAG T AC T G AAGT AAAAAAT G AA 
G C T AAAAT AT C AAG T C AG AC C C AAT T T AC T T TAG AAAAAG GT GAC AAAAT 
AAAT TAT GAT C AAGT AT T GAC AG C AG AT G GT T AC C AGT GGAT T T C T T AC A
AAT CT T AT AGT GG T GTTCGTCGC TAT AT T C C T GT G AAAAAG CT AAC TAG A 
AGT AGT GAAAAAG CG AAAGAT GAGGC GAG T AAAC C G AC T AGT T AT C C C AA 
C T TAG C T AAAAC AG GT AC C TAT AC AT T T ACT AAAACT GT AGAT GT G AAG A 
GTCAACCTAAAGTATCAAGTCCAGTGGAATTTAATTTTCAAAAGGGTGAA 
AAAAT AC AT TAT GAT C AAGT GT T AGT AGT AGAT GGT CAT C AGT G GAT T T C 
AT AC AAG AGT TAT T C C GGT AT T CGT CG C TAT AT T GAAAT T 

SEQ ID NO. 8904 

STRAIN H36B

AAAAAAGG AC AAGT AAAT GAT AC T AAGC AAT C T T AC T 
C T C T AC GT AAAT AT AAAT T T GGT T TAG CAT C AG T AAT T T TAG GGT C ATT C 
AT AAT GGT C AC AAGT CCTGTTTTT GC GGAT C AAAC T AC AT C GG T T C AAGT 
T AAT AAT C AG AC AG G C AC T AGT GT GGAT GAT AAT AAT T C T T C C AAT GAGA 
C AAGT G CGT CAAG T GT GAT T ACT T C C AAT AAT G AT AGT GT T C AAG CGT C T 
GAT AAAG T T G T AAAT AGT C AAAAT AC G G C AAC AAAGG AC AT TACT ACT C C 
T T T AGT AGAGAC AAAGC C AAT GGT GG AAAAAAC AT T AC C T GAAC AAG GG A 
AT T AT GT T TAT AG C AAAGAAAC C GAGGT G AAAAAT AC AC CT T C AAAAT CA 
G C C C C AGT AGC T T T CT AT GC AAAG AAAGGT GAT AAAGT T T T CT AT G AC C A 
AGT AT T T AAT AAAG AT AAT G T GAAAT GGAT T T CAT AT AAG TCTTTTTGTG 
GCGTACGTCGATACGCAGCT ATT GAGTCACT AGAT CCATCAGGAGGTTCA 
GAGAC T AAAG C AC C T AC T C C T GT AAC AAAT T CAGG AAGC AAT AAT CAAG A 
G AAAAT AG C AAC GCAAGG AAAT TAT AC AT T T T C AC AT AAAGT AGAAG T AA 
AAAAT GAAG C T AAG G T AGCGAGT C C AAC T C AAT T TAG AT T GG AC AAAG G A 
G AC AG AAT T T T T T ACG AC C AAAT ACT AAC TAT T G AAGG AAAT C AGT G GT T 
ATCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTtTGCTAGGTAAAG 
CAT C T T C AGT AG AAAAAACT G AAG AT AAAGAAAAAGT G T C T C C T C AAC C A 
C AAGC C C GT AT T AC T AAAAC T GGT AGAC T G AC TAT T T C T AAC G AAAC AAC 
T AC AGGT T T T GAT AT T T T AAT T AC G AAT AT T AAAG AT GAT AAC G GT AT C G 
CTGCTGTTAAGGTACCGGTTTGGACTGAACAAGGAGGGCAAGATGATATT 
AAAT GGT AT AC AG C T G T AAC TAG T GG G GAT G G C AAC T AC AAAGT AG CT GT 
AT CAT T T G C T G AC CAT AAG AAT GAG AAG GGT C T T TAT AAT AT T CAT T TAT 
AC T AC CAAG AAG CT AGT G G G AC AC T T GT AG G T GT AAC AgG AAC T AAAGT G 
ACAGTAGCTGGAACTAATTCTTCTCAAGAACCTATTGAAAATGGTTTAGC 
AAAG AC T GGT GT T TAT AAT AT TAT C GG AAGT AC T G AAGT AAAAAAT GAAG 
CT AAAAT AT C AAGT C AGAC C C AAT T T ACT T T AG AAAAAG G T G AC AAAAT A 
AAT TAT GAT C AAGT AT T G AC AG C AG AT GGT T AC C AGT GGAT T T C T T AC AA 
ATCTTATAGTGGTGTTCGTCGCTATATTCCTGTGAAAAAGCTAACTACAA 
G TAG T G AAAAAG CG AAAG AT GAG G C G AC T AAAC CG AC TAG T TAT C C C AAC 
T T AC C T AAAAC AGGT AC C TAT AC AT T TACT AAAACT GT AG AT GT GAAG AG 
T C AAC C T AAAGT AT C AAGT C C AGT G G AAT T T AAT T T T C AAAAGGGT G AAA 
AAAT AC AT TAT GAT C AAGT GT T AGT AG TAG AT GGT CAT C AG T GGAT T T C A 
T AC AAGAGT T AT T C CG G T AT T CGT C G C TAT AT T GAAAT T 

SEQ ID NO. 8905 

STRAIN 18RS21 

AAAAAAG G AC AAG T AAAT GAT AC T AAG CAAT CT T AC T C 
T C T ACG T AAAT AT AAAT T T GGT T TAG CAT C AG T AAT T T T AGGG T CAT T C A 
TAATGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAGTT 
AAT AAT C AG AC AGG C AC TAG T GT G GAT G C T AAT AAT T C T T C CAAT GAGAC 
AAG T G C GT C AAGT G T GAT T AC T T C CAAT AAT GAT AG T GT T CAAG CGT C T G 
AT AAAGTTGT AAAT AGT C AAAAT ACGGCAACAAAGGACATT ACT ACT CCT 
T T AGT AGAGAC AAAG C C AAT GG T G G AAAAAAC AT TAG CT GAAC AAG G G AA 
T TAT GT T T AT AG C AAAG AAAC C G AGG T G AAAAAT AC AC C T T C AAAAT C AG 
CCCCAGTAGCTTTCTATGCAAAGAAAGGTGATAAAGTTTTCTATGACCAA 
GT AT T T AAT AAAG AT AAT GT GAAAT G GAT T T CAT AT AAG TCTTTTTGTGG 
CGTACGTC GAT AC G C AG C T AT T G AGT C ACT AG AT CC AT C AG GAGGT T C AG 
AG ACT AAAG C AC C T AC T CCT G T AAC AAAT T C AG GAAG CAAT AAT C AAGAG 
AAAAT AG C AAC G CAAG GAAAT TAT AC AT T T T C AC AT AAAGT AG AAGT AAA 
AAAT GAAG c T AAGGT AG C GAG T C C AAC T CAAT T T AC AT T G G AC AAAG GAG 
AC AG AAT T T T T T AC G AC C AAAT AC T AAC TAT T GAAG GAAAT C AG T GGT T A 
TCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTTTGCTAGGTAAAGC 
AT C T T C AG TAG AAAAAAC T GAAG AT AAAG AAAAAG TGTCTCCT C AAC C AC 
AAG C C C GT AT T AC T AAAAC T G G T AG AC T G ACT AT T T C T AAC G AAAC AACT 
AC AG GT T T T GAT AT T T T AAT T ACG AAT AT T AAAG AT GAT AAC G GT AT C GC 
T G C T GT T AAGG T AC C GG T T T GG AC T GAAC AAGGAGGG C AAG AT GAT AT T A
AATGGTATACAGCTGTAACTACTGGGGATGGCAACTACAAAGTAGCTGTA 
T CAT T T G C T GAC C AT AAGAAT GAG AAGG GT C T T T AT AAT AT T CAT T TATA 
C TAG C AAG AAGC T AGT G GGAC ACT T GT AGGT GT AAC AG GAACT AAAGT G A 
C AGT AG CT G GAACT AAT T CT T C T C AAGAAC C TAT T GAAAAT G G T T TAG C A 
AAG AC T GG T GT T T AT AAT AT TAT C GG AAGT AC T GAAGT AAAAAAT G AAGC 
T AAAAT AT C AAG T C AGAC C C AAT T T ACT T T AG AAAAAG G T G AC AAAAT AA 
AT TAT GAT C AAGT AT T G AC AGC AGAT GGT T AC C AGT G GAT T T C T T AC AAA 
T CT T ATAGTGGT GTT CGT CGCT AT ATT CCTGTGAAAAAGCTAACT ACAAG 
T AGT GAAAAAG C GAAAGAT GAG G CG AC T AAAC C GAC TAG T TAT C C C AAC T 
T AC C T AAAAC AG G T AC C TAT AC AT T T AC T AAAACT G T AGAT G T GAAAAGT 
C AAC C T AAAG T AT C AAG T C C AG T GGAAT T T AAT T T T C AAAAGGGT G AAAA 
AAT AC AT TAT GAT C AAGT GT T AGT AGT AGAT GGT CAT C AGT G GAT T T CAT 
ACAAGAGTTAiTTCCGGTATTCGTCGCTATATTGAAATT 

SEQ ID NO. 8906 

STRAIN M732 

C AAGT AAAT GAT a C T AAG C AAT C T T AC T CT C T AC GT AAAT AT AAAT T T G G 
T T TAG CAT C AGT AAT T T T AGGGT CAT T C AT AAT GG T C ACAAG TCCTGTTT 
T T G C G GAT C AAAc T AC AT C G G T T C AAG T T AAT AAT C AG AC AG G C AC TAG T 
GT G GAT G C T AAT AAT T C T T C C AAT GAGACAAGT G CGT C AAGT G T G AT T AC 
T T C C AAT AAT G AT AGT GT T C AAG C G T CT GAT AAAGT T GT AAAT AGT C AAA 
AT AC GG C AAC AAAG GAC AT T AC T AC T C C T T TAG T AG AG AC AAAGC C AAT G 
GTGGAAAAAACATTACCTGAACAAGGGAATTATGTTTATAGCAAAGAAAC 
C GAGGT G AAAAAT AC AC C T T C AAAAT C AG C C C C AGT AG C T T T CT AT GC AA 
AG AAAG GT GAT AAAG T T T T CT AT G AC C AAGT AT T T AAT AAAGAT AAT GT G 
AAAT G GAT T T CAT AT AAGT CTTTTGGTGGCG T AC G T CG AT ACGC AG C TAT 
T GAG T C AC TAG AT C C AT C AGGAG GTT C AG AG AC T AAAG C AC CT ACT C C T G 
T AAC AAAT T CAGGAAGC AAT AAT C AAG AG AAAAT AGC AAC GCAAGGAAAT 
TAT AC AT T T T C AC AT AAAGT AGAAGT AAAAAAT GAAG C T AAGG T AGC GAG 
T C C AAC T C AAT T T AC AT T G G AC AAAGG AG AC AGAAT T T T T T AC GAC C AAA 
T AC T AAC T a t T GAAG G AAAT C AGT G G T TAT C T TAT AAAT CAT T C AAT GGT 
GTT CGT CGT TTTGtTttGcT AGGT AAAGC ATCTTC AGT AGAAAAAACTGA 
AG AT AAAG AAAAAGT G T C T C C T C AAC C AC AAG C C C GT AT T AC T AAAAC T G 
G T AGAC T GAC TAT T T C T AAC G AAAC AAC T AC AG GT T T T GAT AT T T T AAT T 
ACGAATATTAAAGATGATAACGGTATCGCTGCTGTTAAGGTACCGGTTTG 
G AC T GAAC AAG GAGGG C AAG AT GAT AT T AAAT G GT AT AC AG CT GT AAC T A 
C T G G GG AT G G C AAC T AC AAAGT AG C T GT AT CAT T T G C T GAC CAT AAG AAT 
GAG AAG G GT C T T TAT AAT AT T CAT T TAT AC T AC C AAGAAGC T AGT GGG AC 
AC T T GT AGGT GT AAC AG GAACT AAAGT GAC AGT AG CT GG AAC T AAT T C T T 
C T C AAG AAC C TAT T GAAAAT GGT T T AC C AAAG ACT G GT GT T TAT AAT AT T 
AT CG GAAGT ACT GAAGT AAAAAATGAAGCT AAAAT AT CAAGTC AG ACCC A 
AT T T ACT T TAG AAAAAG G T GAC AAAAT AAAT TAT GAT C AAG TAT T GAC AG 

CAGATGGTTACCAGTGGATTTCTTACAAATCTTATAGTGGTGTTCGTCGC 
TAT AT T C C T G T GAAAAAG C T AAC T AC AAGT AG T GAAAAAG C GAAAGAT G A 
G G C G AC T AAAC C G AC T AGT TAT C C C AACT T AC C T AAAAC AGGT AC C T AT A 
CAT T TACT AAAAC T G T AG AT GT GAAAAGT C AAC CT AAAGT AT C AAGT C C A 
GT G G AAT T T AAT T T T C AAAAGGGT GAAAAAAT AC AT TAT GAT C AAG T GT T 
AG T AGT AG AT GGT CAT C AGT G GAT T T CAT AC AAG AG T TAT T C C G G TAT T C 
GT C G C TAT AT T G AAAT T 

SEQ ID NO. 8907 

STRAIN COH1 

AAAAAAG GAC AAG T AAAT GAT ACT AAG C AAT CTTACTCTCT 

ACGT AAAT AT AAAT T TGGTTT AGC AT C AGT AATTTT AGGGT C ATT C AT AA 

TGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAGTTAAT 

AAT C AGAC AG G C AC TAG T G T GG AT G C T AAT AAT T C T T C C AAT GAG AC AAG 

TGCGTCAAGTGTGATTACTTCCAATAATGATAGTGTTCAAGCGTCTGATA 
AAGTTGTAAATAGTCAAAATACGGCAACAAAGGACATTACTACTCCTTTA 
GT AGAG AC AAAG C C AAT GG T G G AAAAAAC AT T AC C T GAAC AAG G G AAT T A 

TGTTTATAGCAAAGAAACCGAGGTGAAAAATACACCTTCAAAATCAGCCC 
C AGT AG C T T T C TAT G C AAAG AAAG GT GAT AAAG T T T T C T AT GAC C AAG T A 

TTTAAT AAAGAT AATGTTAAATGGATTTCATATAAGTCTTTTGGTGGCGT 
AC G T C GAT AC G C AG CT AT T GAG T C AC TAG AT C CAT C AG GAGGT T C AG AG A 
CTAAAGCACCTACTCCTGTAACAAATTCAGGAAGCAATAATCAAGAGAAA
AT AGCAACGCAAGGAAATTAT ACATT T T CACAT AAAGT AGAAGTAaAAAA 
T GAAG c T AAG GT AG CG AGT C C AAC T C AAT T TAG AT T G G AC AAAG GAG AC A 
GAATTTTTTACGACCAAATACTAACTATTGAAGGAAATCAGTGGTTATCT 
TATAAATCATTCAATGGTGTTCGTCGTTTTGTTtTGCTAGGTAAAGCATC 
T T C AGT AGAAAAAACT GAAGAT AAAGAAAAAG TGTCTCCT C AAC C AC AAG 
C C CGT AT T AC T AAAACT G GT AG ACT GAC TAT T T CT AAC GAAAC AAC T AC A 
G GT T T T GAT AT T T T AAT T AC GAAT AT T AAAGAT GAT AACG GT AT CG CTGC 
T GT T AAGGT AC C G GT T T G GAC T G AAC AAGG AG GG CAAGAT GAT AT T AAAT 
GGT AT AC AG CT GT AAC T ACT G GGGAT GG C AAC T AC AAAGT AG C T GT AT C A 
T T T G C T GAC CAT AAG AAT GAGAAGG GT C T T TAT AAT AT T CAT T TAT AC T A 
C C AAG AAG C T AGT G GGAC AC T T GT AGGT GT AAC AG GAAC T AAAG T GAC AG 
T AG CT GGAAC T AAT T CT T C T C AAG AAC C TAT T G AAAAT GGT T TAG C AAAG 
AC T GG T GT T T AT AAT AT T AT C GG AAGT ACT G AAGT AAAAAAT GAAG C T AA 
AAT AT C AAG T C AG AC C C AAT T T ACT T TAG AAAAAGGT GAC AAAAT AAAT T 
AT GAT C AAGT AT T GAC AG C AGAT GG T TAG C AG T GG AT T T C T T AC AAAT C T 
T AT AGT G GT GTTCGTCGC TAT AT T C C T GT G AAAAAG CT AAC T AC AAGT AG 
T G AAAAAG C G AAAG AT GAGG C G AC T AAAC C G AC T AGT TAT C C C AAC T T AC 
C T AAAAC AG GT AC CT AT ACAT T T AC T AAAAC T GT AGAT G T G AAAAGT C AA 
CC T AAAGT AT C AAGT C C AGT GG AAT T T AAT T T T C AAAAGGG T GAAAAAAT 
ACATTATGATCAAGTGTTAGTAGTAGATGGTCATCAGTGGATTTCATACA 
AG AGT TAT T C C GGT AT T C GT C G CT AT AT T G AAAT T 

SEQ ID NO. 8908 

STRAIN M781 

AAAAAAG GAC AAGT AAAT GAT AC T AAG C AAT C T T 

ACT C T C T AC GT AAAT AT AAAT T T GGT T T AGC AT C AG T AAT T T T AG GGT C A 
TT CAT AATGGTCACAAGTCCTGTTTTTGCGGATCAAACT ACAT CGGTTCA 
AGT T AAT AAT C AGAC AGG C AC T AGT GT G GAT G C T AAT AAT T C T T C C AAT G 
AG AC AAGT G C GT C AAG T GT G AT T AC T T C C AAT AAT GAT AG T GT T C AAG CG 
TCTGAT AAAGTTGTAAAT AGT CAAAAT ACGGCAAC AAAGGACAT TACT AC 
T C CT T T AGT AGAG AC AAAG C C AAT G GT G G AAAAAAC AT T AC CT GAAC AAG 
GG AAT TAT GT T TAT AG C AAAG AAAC C GAG GT G AAAAAT AC AC C T T C AAAA 
T C AG C C C C AGT AGC T T T CT AT G C AAAG AAAG GT GAT AAAG T T T T C T AT GA 
C C AAGT AT T T AAT AAAG AT AAT G T G AAAT GG AT T T CAT AT AAGT CT T T T G 
GT GG C GT ACGT C GAT AC G C AG C TAT T G AG T C AC TAG AT C CAT C AG G AG GT 
T C AG AGAC T AAAG C AC CT AC T C C T GT AAC AAAT T C AGG AAG C AAT AAT C A 
AGAG AAAAT AG C AAC G C AAGGAAAT TAT AC AT T T T CACAT AAAGT AG AAG 
T AAAAAAT G AAG CT AAGG TAG C G AGT C C AAC T C AAT T T AC AT T GGAC AAA 
G GAG AC AGAAT T T T T TAG GAC C AAAT AC T AACT AT T GAAGG AAAT C AGT G 
GTTATCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTtTGCTAGGTA 
AAG CAT C T T C AGT AG AAAAAACT GAAGAT AAAGAAAAAGT GT C T C C T C AA 
C C AC AAG C C C GT AT TACT AAAAC T G G TAG AC T GAC TAT T T C T AAC GAAAC 
AAC T AC AG GT T T T GAT AT T T T AAT T AC GAAT AT T AAAGAT GAT AAC G GT A 
TCGCTGCTGT T AAg gT AC CGG T T T G GAC T GAAC AAG GAG GG CAAGAT GAT 
ATT AAAT GGT AT AC AGCTGT AAC TACTGGGGATGGC AACT AC AAAGT AGC 
T GT AT CAT T T G C T GAC CAT AAG AAT GAG AAGG G T C T T TAT AAT AT T CAT T 
T AT ACT AC C AAG AAG CT AGT GG G AC AC T TGT AGG TGT AAC AGG AACT AAA 
GTGACAGTAGCTGGAACTAATTCTTCTCAAGAACCTATTGAAAATGGTTT 
AC C AAAG ACT GGT GTTT AT AAT AT TAT C GG AAGT ACT GAAGT AAAAAAT G 
AAG CT AAAAT AT C AAGT C AG AC C C AAT T T AC T T T AGAAAAAGGT G AC AAA 
AT AAAT TAT GAT C AAGT AT T GAC AG C AG AT GGT T AC C AG T G GAT T T C T T A 
CAAATCTTATAGTGGTGTTCGTCGCTATATTCCTGTGAAAAAGCTAACTA 
C AAGT AGT G AAAAAG C G AAAG AT GAGG C GAC T AAAC C GAC T AG T TAT C C C 
AACT T AC C T AAAAC AG GT AC C TAT AC AT T T AC T AAAAC T GT AG AT G T G AA 
AAGT C AAC CT AAAGT AT C AAGT C C AGTGG AATT T AAT T TT C AAAAGGGT G 
AAAAAAT AC AT TAT GAT C AAG TGT TAG T AGT AG AT GGT CAT C AG T G GAT T 
TCATACAAGAGTTATTCCGGTATTCGTCGCTATATTGAAATT 

SEQ ID NO. 8909 

STRAIN CJB110 

AAAAAAGG AC AAG T AAAT GAT AC T AAG C AAT CT T AC T C T C 

TACGTAAATATAAATTTGGTTTAGCATCAGTAATTTTAGGGTCATTCATA 

ATGGTCACAAGTCCTGTTTTTGCGGATCAAACTACATCGGTTCAAGTTAA 
T AAT C AGAC AG G C ACT AGT GT GG AT GCT AAT AAT T C T T C C AAT G AG AC AA
GT G C GT C AAGT GT G AT T ACT T C C AAT AAT G AT AGT GT T C AAG C G T C T GAT 
AAAGTTGTAAATAGTCAAAATACGGCAACAAAGGACATTACTACTCCTTT 
AGT AG AGAC AAAG C C AAT GGT GGAAAAAAC AT T AC C T GAAC AAG GG AAT T 
AT GT T TAT AG C AAAGAAAC C GAGGT G AAAAAT AC AC C T T C AAAAT CAGC C 
C CAGT AG C T T T C TAT G C AAAGAAAGGT GAT AAAGT T T T C TAT GAC C AAGT 
AT T T AAT AAAGAT AAT GT GAAAT G GAT T T CAT AT AAG TCTTTTTGTGGCG 
TAG G T CG AT AC GC AGC T AT T GAGT CAC T AGAT C CAT C AGGAGGT T C AGAG 
ACTAAAGCACCTACTCCTGTAACAAATTCAGGAAGCAATAATCAAGAGAA 
AATAGCAACGCAAGGAAATTATACATTTTCACATAAAGTAGAAGTAAAAA 
AT GAAG C T AAG GT AG CGAG T C C AAC T C AAT T T AC AT T G GAC AAAGG AG AC 
AGAAT T T T T T AC GAC C AAAT AC T AAC T AT T G AAG GAAAT C AG T G G T TAT C 
TTATAAATCATTCAATGGTGTTCGTCGTTTTGTTTTGCTAGGTAAAGCAT 
C T T C AGT AG AAAAAAC T G AAGAT AAAG AAAAAGT G T C T C CT C AAC C AC AA 
G C C C GT AT T AC T AAAACT GG T AG AC T GAC TAT T T C T AAC G AAAC AAC T AC 
AGGTTTTGATATTTTAATTACGAATATTAAAGATGATAACGGTATCGCTG 
C T GT T AAGG T AC C GGT T T GGAC T GAAC AAG GAG G G C AAGAT GAT AT T AAA 
T GGT AT AC AG C T GT AAC T AC T GG GG AT GG C AAC T AC AAAG TAG CT GT AT C 
AT T T G C T GAC CAT AAG AAT GAG AAG G GT C T T TAT AAT AT T CAT T TAT ACT 
AC C AAG AAG CT AGT G G GAC AC T T GT AG GT G T AAC AGGAAC T AAAGT GAC A 
GT AG CT G GAAC T AAT T C T T C T C AAG AAC C TAT T G AAAAT GGT T T AG C AAA 
G ACT G GT GT T TAT AAT AT TAT C G GAAGT ACT G AAGT AAAAAAT GAAG C T A 
AAAT AT C AAG T C AG AC C C AAT T T ACT T TAGAAAAAGGT GAC AAAAT AAAT 
TAT GAT C AAGT AT T GAC AGC AG AT G GT T AC CAGT G GAT T T C T T AC AAAT C 
T TAT AGT GGT GT T CGT C G CT AT AT T C C T GT G AAAAAG C T AAC T AC AAG T A 
G T G AAAAAG C G AAAG AT GAG G C G AC T AAAC C GAC TAG T TAT C C C AAC T T A 
C C T AAAAC AG GT AC CT AT AC AT T T AC T AAAAC T G T AGAT GT GAAG AGT C A 
AC C T AAAGT AT C AAG T C C AGT G G AAT T T AAT T T T C AAAAGGGT G AAAAAA 
T AC AT TAT GAT C AAGT GT T AGT AGT AGAT GGT CAT C AGT GG AT T T CAT AC 
AAGAGTTATTCCGGTATTCGTCGCTATATTGAAATT 

SEQ ID NO. 8910 

STRAIN 1169NT 

AAAAAAGGACAAGTAAATGATACTAAGCAATCTTACTC 
T CT AC GT AAAT AT AAAT T T GG T T T AGC AT CAGT AAT T T T AGGGT CAT T C A 
T AAT GGT CAC AAGT CCTGTTTTT G CG G AT C AAAC T AC AT C G G T T C AAG T T 
AAT AAT C AG AC AGG C AC T AGT GT G GAT GC T AAT AAT T C T T C C AAT GAG AC 
AAGT G CGT C AAGT GT G AT T AC T T C C AAT AAT GAT AGT GT T C AAG C G T C T G 
AT AAAGT T GT AAAT AG T C AAAAT AC G GC AAC AAAG GAC AT TACT AC T C C T 
T T AGT AG AG AC AAAGC C AAT G G T G GAAAAAAC AT T AC C T GAAC AAG G G AA 
T TAT G T T TAT AGC AAAG AAAC C GAG G T G AAAAAT AC AC C T T C AAAAT CAG 
C C C CAGT AG C T T T C TAT G C AAAG AAAGGT GAT AAAG T T T T CT AT GAC C AA 
G T AT T T AAT AAAG AT AAT GT GAAAT GG AT T T CAT AT AAGT CTTTTGGTGG 
CGTACGTC GAT AC G C AG CT AT T GAGT CAC TAG AT C CAT CAG G AG GT T CAG 
AG AC T AAAG C AC C T AC T C CT G T AAC AAAT T C AGG AAG C AAT AAT C AAG AG 
AAAAT AG C AAC G C AAG GAAAT TAT AC AT T T T CAC AT AAAG TAG AAGT AAA 
AAAT GAAG C T AAG GT AG CGAG T C C AACT C AAT T T AC AT T GGAC AAAG GAG 
AC AGAAT T T T T T ACG AC C AAAT AC T AACT AT T GAAG GAAAT C AGT G G T T A 
TCT TAT AAAT CATTCAATGGTGTT CGT CGTTTTGTTTTGCTAGGT AAAGC 
AT C T T C AGT AGAAAAAAC T GAAG AT AAAG AAAAAGT G T C T C C T C AAC CAC 

AAGCC CGT ATT ACT AAAACT GGT AG ACT GACT AT TTCTAACG AAAC AACT 
AC AGGT T T T GAT AT T T T AAT T AC GAAT AT T AAAG AT GAT AAC GGT AT C G C 
T G C T G T T AAG GT AC CGGT T T G GACT GAAC AAG G AGG G C AAGAT GAT AT T A 
AATGGTATACAGCTGTAACTACTGGGGATGGCAACTACAAAGTAGCTGTA 
T CAT T T G C T GAC CAT AAG AAT G AG AAG GG T C T T TAT AAT AT T CAT T TATA 
C T AC C AAG AAG C TAG T G G GAC AC T T GT AGGT G T AAC AG G AAC T AAAG T G A 
CAGTAGCTGGAaCTAATTCTTCTCAAGAACCTATTGAAAATGGTTTAGCA 
AAG AC T G GT GT T TAT AAT AT TAT C GG AAGT AC T GAAGT AAAAAAT GAAG C 
T AAAAT AT C AAG T CAG AC C C AAT T T AC T T TAG AAAAAG G T GAC AAAAT AA 
AT TAT GAT C AAG TAT T GAC AG CAG AT GGT T AC CAG T GG AT T T C T T AC AAA 
TCTTATAGTGGTGTTCGTCGCTATATTCCTGTGAAAAAGCTAACTACAAG 
T AGT G AAAAAG C G AAAG AT G AGG C G AC T AAAC C G AC T AGT TAT C C C AAC T 
T AC C T AAAAC AG GT AC C TAT AC AT T T AC T AAAACT G T AG AT GT G AAAAG T 
C AAC C T AAAG TAT C AAGT C CAGT GG AAT T T AAT T T T CAAAAGG GT G AAAA 
AAT AC AT TAT GAT C AAGT G T T AGT AGT AGAT GGT CAT C AGT G GAT T T CAT
ACAAGAGTTATTCCGGTATTCGTCGCTATATTGAAATT 

SEQ ID NO. 8911 

STRAIN JM9130013 

AAAAAAG GAC AAG T AAAT GAT ACT AAG C AAT C T T AC T 

CTCTACGTAAATATAAATTTGGTTTAGCATCAGTAATTTTAGGGTCATTC 
AT AAT GGT C AC AAGT C CT GT T T T T GC GG AT C AAACT AC AT CG GT T C AAG T 
T AAT AAT C AGAC AGG C AC TAG T GT GG AT G CT AAT AAT T CTT C C AAT GAGA 
CAAGTGCGTCAAGTGTGATTACTTCCAATAATGATAGTGTTCAAGCGTCT 
GAT AAAGT T G T AAAT AGT C AAAAT AC G G C AAC AAAGG AC AT T AC T AC T C C 
T T TAG TAG AGAC AAAG C C AAT GGT GGAAAAAAC AT T AC C T GAAC AAGGGA 
AT T AT GT T TAT AG C AAAG AAAC C GAG G T G AAAAAT AC AC CTT C AAAAT C A 
G C C C C AG T AG CT T T C TAT G C AAAG AAAG G T GAT AAAGT T T T CT AT GAC C A 
AG TAT T T AAT AAAGAT AAT GT G AAAT GG AT T T CAT AT AAG TCTTTTTGTG 
GC G T ACGT C GAT ACG C AG C TAT T GAG T C AC T AGAT C CAT C AG GAGG T T C A 
GAG ACT AAAG C AC C T AC T C C T G T AAC AAAT T C AG GAAG C AAT AAT C AAG A 
GAAAATAGCAACGCAAGGAAATTATACATTTTCACATAAAGTAGAAGTAA 
AAAAT GAAG C T AAGGT AG C GAGT C C AACT C AAT T T AC AT T GGAC AAAG G A 
GAC AG AAT T T T T T AC GAC C AAAT AC T AACT AT T G AAGG AAAT CAGT G G T T 
ATCTTATAAATCATTCAATGGTGTTCGTCGTTTTGTTTTGCTAGGTAAAG 
CAT CTT C AG T AG AAAAAAC T GAAG AT AAAG AAAAAG TGTCTCCT C AAC C A 
C AAG C C C GT AT T AC T AAAACT GGT AG AC T G ACT AT T TAT AAC G AAAC AAC 
T AC AGGT T T T GAT AT T T T AAT T AC GAAT AT T AAAG AT GAT AAC G GT AT CG 
CTGCTGTT AAGGT AC CGGTTTGGACTGAACAAGGAGGGCAAGATGAT ATT 
AAAT G G T AT AC AG CT GT AAC T AC T G G GGAT GG C AAC TAG AAAGT AG C T GT 
AT CAT T T G C T GAC CAT AAG AAT G AGAAG GGT C T T TAT AAT AT T CAT T TAT 
AC T AC C AAGAAG C T AGT G G GAC AC T T G T AG GT GT AAC AGG AAC T AAAGT G 
AC AGT AG C T G GAAC T AAT T C T T CT C AAG AAC C T AT T G AAAAT GGT T TAG C 
AAAG AC T GGT GT T TAT AAT AT TAT C G GAAG T AC T GAAGT AAAAAAT GAAG 
C T AAAAT AT C AAG T C AG AC C C AAT T T AC T T T AGAAAAAG GT G AC AAAAT A 
AAT TAT GAT C AAGT AT T GAC AG C AG AT GGT T AC C AG T GGAT T T CT T AC AA 
ATCTTATAGTGGTGTTCGTCGCTATATTCCTGTGAAAAAGCTAACTACAA 
G T AGT G AAAAAG C G AAAG AT GAG G C GAC T AAAC C G AC T AGT TAT C C C AAC 
T T AC CT AAAAC AGGT AC C TAT AC AT T T AC T AAAAC T GT AG AT GT GAAG AG 
T C AAC C T AAAGT AT C AAG T C C AG T GG AAT T T AAT T T T C AAAAG GGT G AAA 
AAAT AC AT TAT GAT C AAGT G T TAG T AGT AGAT GGT CAT C AGT G GAT T T C A 
T AC AAG AGT TAT T C C GG T AT T CGT C G C TAT AT T G AAAT T 

SEQ ID NO. 8912 

STRAIN 2603 frame: 1

MKKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNS 
SNETSASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKE 
TEVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGS 
ETKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILT
IEGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGF
DILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYN
IHLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQT
QFTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPN
LPKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRY
IEI 

SEQ ID NO. 8913 

STRAIN 090 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRYI
EI

SEQ ID NO. 8914

STRAIN A909 frame: 1

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI
EI 

SEQ ID NO. 8915 

STRAIN H36B frame: 1

KKGQWDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQWNQTGTSVDDNNSS 
NETSASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8916 

STRAIN 18RS21 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8917 

STRAIN M732 frame: 1 

QVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSSNET 
SASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKETEVK
NTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFGGVRRYAAIESLDPSGGSETKA 
PTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTIEGN 
QWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFDILI
TNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNIHLY 
YQEASGTLVGVTGTKVTVAGTNSSQEPIENGLPKTGVYNIIGSTEVKNEAKISSQTQFTL 
EKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNLPKT 
GTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYIEI

SEQ ID NO. 8918 

STRAIN COH1 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVVNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFGGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLPKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRYI 
EI

SEQ ID NO. 8919

STRAIN M781 frame: 1

KKGQWDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKVWSQNTATKDITTPLVETKPIWEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFGGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD 
ILITNIKDDNGIAAVKVPWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI 
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLPKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8920 

STRAIN CJB110 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS 
NETSASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE 
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD 
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI 
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL 
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8921 

STRAIN 1169NT frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS
NETSASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFGGVRRYAAIESLDPSGGSE
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTISNETTTGFD
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI 
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLVVDGHQWISYKSYSGIRRYI 
EI 

SEQ ID NO. 8922 

STRAIN JM9130013 frame: 1 

KKGQVNDTKQSYSLRKYKFGLASVILGSFIMVTSPVFADQTTSVQVNNQTGTSVDANNSS
NETSASSVITSNNDSVQASDKWNSQNTATKDITTPLVETKPMVEKTLPEQGNYVYSKET 
EVKNTPSKSAPVAFYAKKGDKVFYDQVFNKDNVKWISYKSFCGVRRYAAIESLDPSGGSE
TKAPTPVTNSGSNNQEKIATQGNYTFSHKVEVKNEAKVASPTQFTLDKGDRIFYDQILTI 
EGNQWLSYKSFNGVRRFVLLGKASSVEKTEDKEKVSPQPQARITKTGRLTIYNETTTGFD 
ILITNIKDDNGIAAVKVPVWTEQGGQDDIKWYTAVTTGDGNYKVAVSFADHKNEKGLYNI 
HLYYQEASGTLVGVTGTKVTVAGTNSSQEPIENGLAKTGVYNIIGSTEVKNEAKISSQTQ
FTLEKGDKINYDQVLTADGYQWISYKSYSGVRRYIPVKKLTTSSEKAKDEATKPTSYPNL
PKTGTYTFTKTVDVKSQPKVSSPVEFNFQKGEKIHYDQVLWDGHQWISYKSYSGIRRYI 
EI 
SAG0001


453 


chromosomal replication initiator protein DnaA 


SAG0002 


378 


DNA polymerase III, beta subunit 


SAG0003 


293 


diacylglycerol kinase catalytic domain protein, putative 


SAG0004 


65 


conserved hypothetical protein 


SAG0005 


67 


hypothetical protein 


SAG0006 


371 


GTP-binding protein YchF 


SAG0007 


191 


peptidyl-tRNA hydrolase 


SAG0008 


1165 


transcription-repair coupling factor 


SAG0009 


31 


hypothetical protein 


SAG0010


90 


S4 domain protein 


SAG0011


123 


cell division protein DivIC, putative 


SAG0012 


44 


conserved hypothetical protein 


SAG0013


428 


protein of unknown function 


SAG0014


424 


MesJ/Ycf62 family protein


SAG0015


180 


hypoxanthine-guanine phosphoribosyltransferase


SAG0016


658 


cell division protein FtsH 


SAG0017


447 


pcsB protein 


SAG0018


322 


ribose-phosphate pyrophosphokinase 


SAG0019


391 


aminotransferase, class I 


SAG0020 


253 


recombination protein O 


SAG0021 


283 


protease, putative 


SAG0022 


330 


fatty acid/phospholipid synthesis protein PlsX 


SAG0023 


79 


acyl carrier protein 


SAG0024 


234 


phosphoribosylaminoimidazole-succinocarboxamide synthase 


SAG0025 


1241 


phosphoribosylformylglycinamidine synthase, putative 


SAG0026 


484 


amidophosphoribosyltransferase 


SAG0027 


340 


phosphoribosylformylglycinamidine cyclo-ligase 


SAG0028 


182 


phosphoribosylglycinamide formyltransferase 


SAG0029 


250 


acetyltransferase, GNAT family 


SAG0030 


515 


phosphoribosylaminoimidazolecarboxamide 
formyltransferase/IMP cyclohydrolase 


SAG0031 


299 


peptidase, M23/M37 family 


SAG0032 


434 


group B streptococcal surface immunogenic protein 


SAG0033 


232 


N-acetylmannosamine-6-P epimerase, putative 


SAG0034 


438 


sugar ABC transporter, sugar-binding protein 


SAG0035 


295 


sugar ABC transporter, permease protein 


SAG0036 


276 


sugar ABC transporter, permease protein 


SAG0037 


147 


conserved hypothetical protein 


SAG0038 


220 


conserved hypothetical protein 


SAG0039 


305 


N-acetylneuraminate lyase, putative 


SAG0040 


293 


ROK family protein 


SAG0041 


325 


acetyl xylan esterase, putative 


SAG0042 


267 


phosphosugar-binding transcriptional regulator, RpiR family, 
putative 


SAG0043 


421 


phosphoribosylamine--glycine ligase


SAG0044 


162 


phosphoribosylaminoimidazole carboxylase, catalytic subunit 


SAG0045 


363 


phosphoribosylaminoimidazole carboxylase, ATPase subunit 


SAG0046 


463 


membrane protein, putative 



399 



WO 2004/018646 



PCT/US2003/026827 



Table 1: Complete list of GBS predicted genes 



ORF 


Size 
(a.a.) 


Annotation 


SAG0047 


432 


adenylosuccinate lyase 


SAG0048 


303 


transcriptional regulator, Cro/CI family 


SAG0049 


332 


Holliday junction DNA helicase RuvB 


SAG0050 


145 


phosphotyrosine protein phosphatase, low molecular weight 


SAG0051 


126 


MORN motif family protein 


SAG0052 


592 


membrane protein, putative 


SAG0053 


880 


aldehyde-alcohol dehydrogenase 


SAG0054 


338 


alcohol dehydrogenase, propanol-preferring 


SAG0055 


496 


threonine synthase 


SAG0056 


412 


MATE efflux family protein 


SAG0057 


102 


ribosomal protein S10 


SAG0058 


208 


ribosomal protein L3 


SAG0059 


207 


ribosomal protein L4 


SAG0060 


98 


ribosomal protein L23 


SAG0061 


277 


ribosomal protein L2 


SAG0062 


92 


ribosomal protein S19 


SAG0063 


114 


ribosomal protein L22 


SAG0064 


217 


ribosomal protein S3 


SAG0065 


137 


ribosomal protein L16


SAG0066 


68 


ribosomal protein L29 


SAG0067 


86 


ribosomal protein S17


SAG0068 


122 


ribosomal protein L14


SAG0069 


101 


ribosomal protein L24 


SAG0070 


180 


ribosomal protein L5 


SAG0071 


61 


ribosomal protein S14, putative


SAG0072 


132 


ribosomal protein S8 


SAG0073 


178 


ribosomal protein L6 


SAG0074 


118 


ribosomal protein L18


SAG0075 


164 


ribosomal protein S5 


SAG0076 


59 


ribosomal protein L30 


SAG0077 


146 


ribosomal protein L15


SAG0078 


434 


preprotein translocase, SecY subunit 


SAG0079 


212 


adenylate kinase 


SAG0080 


72 


translation initiation factor IF-1


SAG0081 


38 


ribosomal protein L36 


SAG0082 


121 


ribosomal protein S13 


SAG0083 


118 


ribosomal protein S11


SAG0084 


312 


DNA-directed RNA polymerase, alpha subunit 


SAG0085 


128 


ribosomal protein L17


SAG0086 


85 


lipoprotein, putative 


SAG0087 


59 


hypothetical protein 


SAG0088 


56 


hypothetical protein 


SAG0089 


183 


conserved hypothetical protein 


SAG0090 


139 


conserved hypothetical protein 


SAG0091 


144 


transcriptional regulator ComX1, putative


SAG0092 


230 


phosphoglycerate mutase family protein 


SAG0093 


250 


D-alanyl-D-alanine carboxypeptidase family protein 


SAG0094 


191 


N-acetylmuramoyl-L-alanine amidase, family 4 protein 
SAG0095


344 


heat-inducible transcription repressor HrcA 


SAG0096 


190 


heat shock protein GrpE 


SAG0097 


609 


dnaK protein 


SAG0098 


379 


dnaJ protein 


SAG0099 


415 


transcriptional regulator, GntR family 


SAG0100


258 


tRNA pseudouridine synthase A 


SAG0101 


252 


phosphomethylpyrimidine kinase, putative 


SAG0102


154 


conserved hypothetical protein 


SAG0103


189 


conserved hypothetical protein TIGR01440 


SAG0104


280 


conserved hypothetical protein 


SAG0105


427 


trigger factor 


SAG0106 


191 


DNA-directed RNA polymerase, delta subunit, putative 


SAG0107 


534 


CTP synthase 


SAG0108 


308 


conserved hypothetical protein 


SAG0109 


148 


deoxyuridine 5'-triphosphate nucleotidohydrolase


SAG0110 


454 


DNA repair protein RadA 


SAG0111 


165 


carbonic anhydrase-related protein 


SAG0112 


439 


pyridine nucleotide-disulphide oxidoreductase family protein 


SAG0113 


484 


glutamyl-tRNA synthetase 


SAG0114 


322 


ribose ABC transporter, periplasmic D-ribose-binding protein 


SAG0115 


310 


ribose ABC transporter, permease protein 


SAG0116 


492 


ribose ABC transporter, ATP-binding protein 


SAG0117 


132 


ribose ABC transporter protein RbsD 


SAG0118 


303 


ribokinase 


SAG0119 


328 


ribose operon repressor RbsR 


SAG0120 


32 


hypothetical protein 


SAG0121


362 


permease, putative 


SAG0122 


228 


ABC transporter, ATP-binding protein 


SAG0123 


223 


DNA-binding response regulator 


SAG0124 


356 


sensor histidine kinase 


SAG0125 


396 


argininosuccinate synthase 


SAG0126 


462 


argininosuccinate lyase 


SAG0127 


293 


fructose-bisphosphate aldolase 


SAG0128 


305 


L-2-hydroxyisocaproate dehydrogenase 


SAG0129 


62 


ribosomal protein L28 


SAG0130 


121 


conserved hypothetical protein 


SAG0131 


543 


DAK2 domain protein 


SAG0132


294 


SPFH domain/Band 7 family protein 


SAG0133 


38 


conserved hypothetical protein 


SAG0134


96 


hypothetical protein 


SAG0135


246 


amino acid ABC transporter, ATP-binding protein 


SAG0136 


516 


amino acid ABC transporter, amino acid-binding protein/permease 
protein 


SAG0137


627 


conserved hypothetical protein 


SAG0138


279 


undecaprenol kinase, putative 


SAG0139


251 


negative regulator of competence MecA, putative 


SAG0140 


386 


glycosyl transferase, group 4 family protein 


SAG0141 


256 


ABC transporter, ATP-binding protein 
SAG0142


420 


conserved hypothetical protein 


SAG0143 


410 


selenocysteine lyase 


SAG0144 


147 


NifU family protein 


SAG0145 


472 


conserved hypothetical protein 


SAG0146


395 


penicillin-binding protein 4, putative 


SAG0147 


411 


D-alanyl-D-alanine carboxypeptidase family protein 


SAG0148


551 


oligopeptide ABC transporter, substrate-binding protein, putative 


SAG0149 


304 


oligopeptide ABC transporter, permease protein 


SAG0150


343 


oligopeptide ABC transporter, permease protein 


SAG0151 


348 


oligopeptide ABC transporter, ATP-binding protein 


SAG0152 


310 


oligopeptide ABC transporter, ATP-binding protein 


SAG0153


283 


4-diphosphocytidyl-2C-methyl-D-erythritol kinase 


SAG0154 


147 


adc operon repressor AdcR 


SAG0155 


236 


zinc ABC transporter, ATP-binding protein 


SAG0156 


270 


zinc ABC transporter, permease protein 


SAG0157


NA 


deoxyribonuclease-related protein, degenerate 


SAG0158 


419 


tyrosyl-tRNA synthetase 


SAG0159


765 


penicillin-binding protein 1B, putative


SAG0160


1191 


DNA-directed RNA polymerase, beta subunit 


SAG0161 


1216 


DNA-directed RNA polymerase beta' subunit 


SAG0162


121 


conserved hypothetical protein 


SAG0163


323 


competence protein CglA 


SAG0164


282 


competence protein CglB 


SAG0165


151 


conserved hypothetical protein 


SAG0166


123 


conserved domain protein 


SAG0167


324 


conserved hypothetical protein 


SAG0168


397 


acetate kinase 


SAG0169


68 


transcriptional regulator, Cro/CI family 


SAG0170


45 


hypothetical protein 


SAG0171 


151 


hypothetical protein 


SAG0172


221 


protease, putative 


SAG0173


256 


pyrroline-5-carboxylate reductase 


SAG0174 


355 


glutamyl-aminopeptidase 


SAG0175 


79 


hypothetical protein 


SAG0176


94 


conserved hypothetical protein 


SAG0177 


107 


thioredoxin family protein 


SAG0178


208 


tRNA binding domain protein 


SAG0179 


238 


conserved hypothetical protein 


SAG0180 


131 


single-strand binding protein 


SAG0181 


214 


hydrolase, haloacid dehalogenase-like family 


SAG0182 


581 


sensor histidine kinase, putative 


SAG0183


246 


response regulator 


SAG0184


151 


conserved hypothetical protein 


SAG0185 


242 


membrane protein, putative 


SAG0186


36 


hypothetical protein 


SAG0187


542 


oligopeptide ABC transporter, oligopeptide-binding protein 


SAG0188


325 


oligopeptide ABC transporter, permease protein 


SAG0189


273 


oligopeptide ABC transporter, permease protein 
SAG0190


267 


peptide ABC transporter, ATP-binding protein 


SAG0191 


208 


peptide ABC transporter, ATP-binding protein 


SAG0192


676 


PTS system, IIABC components 


SAG0193


541 


alpha amylase family protein 


SAG0194


639 


transcriptional antiterminator, BglG family 


SAG0195 


377 


IS1548, transposase


SAG0196


66 


conserved domain protein 


SAG0197


94 


PTS system, IIB component, putative 


SAG0198 


451 


PTS system, IIC component, putative 


SAG0199


285 


transketolase, N-terminal subunit 


SAG0200 


309 


transketolase, C-terminal subunit 


SAG0201 


419 


oxidoreductase, putative 


SAG0202 


89 


ribosomal protein S15


SAG0203 


709 


polyribonucleotide nucleotidyltransferase 


SAG0204 


250 


conserved hypothetical protein 


SAG0205 


194 


serine O-acetyltransferase 1 


SAG0206 


60 


lipoprotein, putative 


SAG0207 


447 


cysteinyl-tRNA synthetase 


SAG0208 


128 


conserved hypothetical protein 


SAG0209 


251 


RNA methyltransferase, TrmH family, group 3 


SAG0210 


172 


conserved hypothetical protein 


SAG0211 


286 


DegV family protein 


SAG0212 


32 


hypothetical protein 


SAG0213 


39 


hypothetical protein 


SAG0214 


148 


ribosomal protein L13


SAG0215 


130 


ribosomal protein S9 


SAG0216 


33 


hypothetical protein 


SAG0217 


384 


site-specific recombinase, phage integrase family 


SAG0218 


158 


transcriptional regulator, Cro/CI family 


SAG0219 


101 


hypothetical protein 


SAG0220 


92 


conserved hypothetical protein 


SAG0221 


76 


hypothetical protein 


SAG0222 


108 


conserved domain protein 


SAG0223 


209 


conserved hypothetical protein, fusion 


SAG0224 


332 


replication initiation protein, putative 


SAG0225 


144 


hypothetical protein 


SAG0226 


418 


recombination protein 


SAG0227 


156 


hypothetical protein 


SAG0228 


111 


conserved hypothetical protein 


SAG0229 


95 


conserved hypothetical protein 


SAG0230 


96 


conserved hypothetical protein 


SAG0231 


135 


hypothetical protein 


SAG0232 


186 


hypothetical protein 


SAG0233 


226 


hypothetical protein 


SAG0234 


128 


hypothetical protein 


SAG0235 


93 


hypothetical protein 


SAG0236 


32 


hypothetical protein 


SAG0237 


34 


hypothetical protein 
SAG0238


41 


hypothetical protein 


SAG0239 


286 


transcriptional regulator, MutR family


SAG0240 


393 


transporter, putative 


SAG0241 


213 


amino acid ABC transporter, permease protein 


SAG0242 


308 


amino acid ABC transporter, amino acid-binding protein 


SAG0243 


211 


amino acid ABC transporter, permease protein 


SAG0244 


381 


amino acid ABC transporter, ATP-binding protein 


SAG0245 


152 


protein of unknown function/lipoprotein, putative 


SAG0246 


268 


hypothetical protein 


SAG0247 


116 


hypothetical protein 


SAG0248 


90 


hypothetical protein 


SAG0249 


116 


hypothetical protein 


SAG0250 


193 


membrane protein, putative 


SAG0251 


72 


transcriptional regulator, Cro/CI family 


SAG0252 


186 


acetyltransferase, GNAT family 


SAG0253 


192 


acetyltransferase, GNAT family 


SAG0254 


226 


acetyltransferase, GNAT family 


SAG0255 


315 


conserved hypothetical protein 


SAG0256 


163 


RNA polymerase sigma factor, ECF subfamily 


SAG0257 


53 


lipoprotein, putative 


SAG0258 


202 


transcriptional regulator, TetR family 


SAG0259 


365 


ABC transporter efflux protein, DrrB family, putative 


SAG0260 


238 


ABC transporter, ATP-binding protein 


SAG0261 


129 


IS1381, transposase OrfB 


SAG0262 


127 


IS1381, transposase OrfA 


SAG0263 


171 


hypothetical protein 


SAG0264 


103 


conserved hypothetical protein 


SAG0265 


235 


conserved hypothetical protein 


SAG0266 


382 


N-acetylglucosamine-6-phosphate deacetylase 


SAG0267 


180 


conserved hypothetical protein 


SAG0268 


304 


glycyl-tRNA synthetase, alpha subunit 


SAG0269 


213 


acyl carrier protein phosphodiesterase, putative 


SAG0270 


679 


glycyl-tRNA synthetase, beta subunit 


SAG0271 


85 


conserved hypothetical protein 


SAG0272 


87 


membrane protein, putative 


SAG0273 


502 


glycerol kinase 


SAG0274 


609 


alpha-glycerophosphate oxidase 


SAG0275 


232 


glycerol uptake facilitator protein 


SAG0276 


445 


NADH oxidase, putative 


SAG0277 


476 


conserved hypothetical protein 


SAG0278 


661 


transketolase 


SAG0279 


101 


conserved hypothetical protein 


SAG0280 


244 


ABC transporter, ATP-binding protein 


SAG0281 


534 


membrane protein, putative 


SAG0282 


461 


PTS system, IIBC components 


SAG0283 


267 


glutamate 5-kinase 


SAG0284 


417 


gamma-glutamyl phosphate reductase 


SAG0285 


298 


conserved hypothetical protein TIGR00006 
SAG0286


108


cell division protein FtsL, putative 


SAG0287 


752 


penicillin-binding protein 2X 


SAG0288 


336 


phospho-N-acetylmuramoyl-pentapeptide-transferase 


SAG0289 


447 


ATP-dependent RNA helicase, DEAD/DEAH box family 


SAG0290 


270 


ABC transporter, substrate-binding protein 


SAG0291 


267 


amino acid ABC transporter, permease protein 


SAG0292 


247 


amino acid ABC transporter, ATP-binding protein 


SAG0293 


74 


conserved hypothetical protein 


SAG0294 


304 


thioredoxin reductase 


SAG0295 


486 


conserved hypothetical protein 


SAG0296 


273 


NAD synthetase 


SAG0297 


444 


aminopeptidase C 


SAG0298 


750 


penicillin-binding protein 1A 


SAG0299 


199 


recombination protein U 


SAG0300 


172 


conserved hypothetical protein 


SAG0301 


40 


hypothetical protein 


SAG0302 


110 


conserved hypothetical protein 


SAG0303 


384 


conserved hypothetical protein 


SAG0304 


487 


conserved hypothetical protein 


SAG0305 


160 


autoinducer-2 production protein LuxS 


SAG0306 


535 


KH domain protein 


SAG0307 


33 


hypothetical protein 


SAG0308 


298 


ABC transporter, ATP-binding protein 


SAG0309 


246 


ABC transporter, permease protein, putative 


SAG0310 


361 


conserved hypothetical protein 


SAG0311 


NA 


DNA-binding response regulator, authentic point mutation 


SAG0312 


234 


conserved hypothetical protein 


SAG0313 


209 


guanylate kinase 


SAG0314 


104 


DNA-directed RNA polymerase, omega subunit, putative 


SAG0315 


796 


primosomal protein N'


SAG0316 


311 


methionyl-tRNA formyltransferase 


SAG0317 


440


sun protein 


SAG0318 


245 


serine/threonine phosphatase, putative 


SAG0319 


651 


serine/threonine protein kinase 


SAG0320 


231 


conserved hypothetical protein 


SAG0321 


339 


sensor histidine kinase, putative 


SAG0322 


213 


DNA-binding response regulator 


SAG0323 


466 


hydrolase, haloacid dehalogenase family/peptidyl-prolyl cis-trans 
isomerase, cyclophilin type 


SAG0324 


124 


general stress protein, putative 


SAG0325 


258 


pyruvate formate-lyase-activating enzyme 


SAG0326 


251 


transcriptional regulator, DeoR family 


SAG0327 


327 


transcriptional regulator, putative 


SAG0328 


107 


PTS system, cellobiose-specific IIA component 


SAG0329 


106 


PTS system, cellobiose-specific IIB component 


SAG0330 


433 


PTS system, cellobiose-specific IIC component 


SAG0331 


818 


formate acetyltransferase 


SAG0332 


222 


transaldolase family protein 
SAG0333


362 


glycerol dehydrogenase 


SAG0334 


308 


cysteine synthase A 


SAG0335 


214 


conserved hypothetical protein TIGR00257 


SAG0336 


429 


helicase, putative 


SAG0337 


221 


competence protein F, putative 


SAG0338 


184 


ribosomal subunit interface protein 


SAG0339 


450 


aspartate kinase family protein 


SAG0340 


216 


hydrolase, haloacid dehalogenase-like family 


SAG0341 


49 


hypothetical protein 


SAG0342 


263 


enoyl-CoA hydratase/isomerase family protein 


SAG0343 


144 


transcriptional regulator, MarR family 


SAG0344 


323 


3-oxoacyl-(acyl-carrier-protein) synthase III 


SAG0345 


74 


acyl carrier protein 


SAG0346 


319 


enoyl-(acyl-carrier-protein) reductase II 


SAG0347 


308 


malonyl CoA-acyl carrier protein transacylase 


SAG0348 


244 


3-oxoacyl-[acyl-carrier protein] reductase


SAG0349 


410 


3-oxoacyl-(acyl-carrier-protein) synthase II 


SAG0350 


166 


acetyl-CoA carboxylase, biotin carboxyl carrier protein 


SAG0351 


140 


(3R)-hydroxymyristoyl-(acyl-carrier-protein) dehydratase 


SAG0352 


456 


acetyl-CoA carboxylase, biotin carboxylase 


SAG0353 


291 


acetyl-CoA carboxylase, carboxyl transferase, beta subunit 


SAG0354 


257 


acetyl-CoA carboxylase, carboxyl transferase, alpha subunit 


SAG0355 


210 


conserved hypothetical protein 


SAG0356 


425 


seryl-tRNA synthetase 


SAG0357 


330 


membrane protein, putative 


SAG0358 


120 


conserved hypothetical protein 


SAG0359 


303 


PTS system, mannose-specific IID component 


SAG0360 


270 


PTS system, mannose-specific IIC component 


SAG0361 


336 


PTS system, mannose-specific IIAB components 


SAG0362 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0363 


194 


hypothetical protein 


SAG0364 


203 


membrane protein, putative 


SAG0365 


473 


xanthine/uracil permease family protein 


SAG0366 


169 


conserved hypothetical protein TIGR00150 


SAG0367 


186 


acetyltransferase, GNAT family 


SAG0368 


435 


protein of unknown function 


SAG0369 


98 


conserved hypothetical protein 


SAG0370 


139 


HIT family protein 


SAG0371 


167 


hypothetical protein 


SAG0372 


85 


hypothetical protein 


SAG0373 


241 


ABC transporter, ATP-binding protein 


SAG0374 


344 


ABC transporter, permease protein 


SAG0375 


266 


conserved hypothetical protein 


SAG0376 


211 


conserved hypothetical protein TIGR00091 


SAG0377 


127 


conserved hypothetical protein 


SAG0378 


379 


N utilization substance protein A 


SAG0379 


98 


conserved hypothetical protein 


SAG0380 


100 


ribosomal protein L7A family 
SAG0381


927 


translation initiation factor IF-2 


SAG0382 


122 


ribosome-binding factor A 


SAG0383 


334 


protein of unknown function/lipoprotein, putative 


SAG0384 


138 


transcriptional repressor CopY 


SAG0385 


744 


copper-transporter ATPase CopA 


SAG0386 


68 


copper-transporter protein CopZ 


SAG0387 


204 


membrane protein, putative 


SAG0388 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0389 


880 


DNA polymerase I 


SAG0390 


146 


CoA-binding domain protein 


SAG0391 


159 


transcriptional regulator, Fur family 


SAG0392 


521 


cell wall surface anchor family protein 


SAG0393 


228 


DNA-binding response regulator 


SAG0394 


345 


sensor histidine kinase 


SAG0395 


246 


membrane protein, putative 


SAG0396 


380 


queuine tRNA-ribosyltransferase 


SAG0397 


102 


conserved hypothetical protein


SAG0398 


179 


BioY family protein 


SAG0399 


258 


AtsA/ElaC family protein 


SAG0400 


168 


cytidine/deoxycytidylate deaminase family protein 


SAG0401 


44 


hypothetical protein 


SAG0402 


449 


glucose-6-phosphate isomerase 


SAG0403 


175 


5-formyltetrahydrofolate cyclo-ligase family protein 


SAG0404 


225 


rhomboid family protein 


SAG0405 


347 


protein of unknown function/lipoprotein, putative 


SAG0406 


299 


UTP-glucose-1-phosphate uridylyltransferase


SAG0407 


338 


glycerol-3-phosphate dehydrogenase (NAD(P)+)


SAG0408 


109 


ribonuclease P protein component 


SAG0409 


271 


SpoIIIJ family protein 


SAG0410 


273 


R3H domain protein 


SAG0411 


177 


conserved hypothetical protein 


SAG0412 


258 


recX protein 


SAG0413 


451 


RNA methyltransferase, TrmA family 


SAG0414 


153 


conserved hypothetical protein 


SAG0415 


142 


acetyltransferase, GNAT family 


SAG0416 


1233 


protease, putative 


SAG0417 


302 


glycosyl transferase, group 2 family protein 


SAG0418 


336 


ribonucleoside-diphosphate reductase 2, beta subunit 


SAG0419 


137 


nrdI protein


SAG0420 


721 


ribonucleoside-diphosphate reductase 2, alpha subunit 


SAG0421 


1055 


cell wall surface anchor family protein 


SAG0422 


129 


conserved hypothetical protein 


SAG0423 


132 


conserved domain protein 


SAG0424 


94 


hypothetical protein 


SAG0425 


105 


carboxymuconolactone decarboxylase family protein 


SAG0426 


131 


conserved hypothetical protein 


SAG0427 


129 


transcriptional regulator, MerR family 


SAG0428 


345 


alcohol dehydrogenase, zinc-containing 
SAG0429


284 


oxidoreductase, aldo/keto reductase family 


SAG0430 


287 


cation efflux system protein 


SAG0431 


174 


transcriptional regulator, TetR family 


SAG0432 


397 


transcriptional regulator, AraC family 


SAG0433 


1389 


surface protein Rib 


SAG0434 


61 


transposase, IS256 family, truncation 


SAG0435 


97 


DNA-damage-inducible protein J, putative 


SAG0436 


62 


hypothetical protein 


SAG0437 


123 


lipoprotein, putative 


SAG0438 


145 


bacteriophage L54a, integrase, truncation 


SAG0439 


NA 


conserved hypothetical protein, degenerate 


SAG0440 


84 


conserved hypothetical protein 


SAG0441 


103 


conserved domain protein 


SAG0442 


189 


acetyltransferase, GNAT family 


SAG0443 


194 


acetyltransferase, GNAT family 


SAG0444 


188 


conserved hypothetical protein 


SAG0445 


883 


valyl-tRNA synthetase 


SAG0446 


319 


oxidoreductase, Gfo/Idh/MocA family 


SAG0447 


287 


magnesium transporter, CorA family 


SAG0448 


391 


transposase, IS256 family 


SAG0449 


354 


conserved hypothetical protein 


SAG0450 


330 


aspartate--ammonia ligase


SAG0451 


149 


bacteriocin transport accessory protein, putative 


SAG0452 


179 


type II DNA modification methyltransferase, putative 


SAG0453 


96 


hypothetical protein 


SAG0454 


161 


phosphopantetheine adenylyltransferase 


SAG0455 


357 


conserved hypothetical protein 


SAG0456 


NA 


conserved hypothetical protein, degenerate 


SAG0457 


192 


conserved hypothetical protein 


SAG0458 


368 


conserved hypothetical protein TIGR00048 


SAG0459 


171 


VanZF domain protein 


SAG0460 


581 


ABC transporter, ATP-binding/permease protein 


SAG0461 


579 


ABC transporter, ATP-binding/permease protein 


SAG0462 


188 


anthranilate synthase component II 


SAG0463 


179 


BioY family protein 


SAG0464 


330 


biotin synthetase 


SAG0465 


164 


hypothetical protein 


SAG0466 


371 


thiolase 


SAG0467 


409 


AMP-binding enzyme domain protein 


SAG0468 


210 


endonuclease III 


SAG0469 


131 


type IV prepilin peptidase-related protein 


SAG0470 


69 


conserved hypothetical protein 


SAG0471 


322 


glucokinase 


SAG0472 


126 


rhodanese-like family protein 


SAG0473 


613 


elongation factor Tu family protein 


SAG0474 


81 


conserved hypothetical protein 


SAG0475 


451 


UDP-N-acetylmuramoylalanine--D-glutamate ligase


SAG0476 


358 


UDP-N-acetylglucosamine--N-acetylmuramyl-(pentapeptide)
pyrophosphoryl-undecaprenol N-acetylglucosamine transferase


SAG0477 


378 


cell division protein DivIB, putative 


SAG0478 


429 


cell division protein FtsA 


SAG0479 


426 


cell division protein FtsZ 


SAG0480 


224 


ylmE protein, putative 


SAG0481 


201 


ylmF protein 


SAG0482 


84 


YGGT family protein 


SAG0483 


262 


ylmH protein 


SAG0484 


256 


cell division protein DivIVA, putative 


SAG0485 


930 


isoleucyl-tRNA synthetase 


SAG0486 


100 


conserved hypothetical protein 


SAG0487 


151 


MutT/nudix family protein 


SAG0488 


753 


ATP-dependent Clp protease, ATP-binding subunit 


SAG0489 


34 


hypothetical protein 


SAG0490 


76 


conserved hypothetical protein 


SAG0491 


230 


amino acid ABC transporter, permease protein 


SAG0492 


244 


amino acid ABC transporter, ATP-binding protein 


SAG0493 


564 


phosphoglucomutase/phosphomannomutase family protein 


SAG0494 


284 


methylenetetrahydrofolate
dehydrogenase/methenyltetrahydrofolate cyclohydrolase


SAG0495 


278 


protein of unknown function 


SAG0496 


446 


exodeoxyribonuclease VII, large subunit 


SAG0497 


71 


exodeoxyribonuclease VII, small subunit 


SAG0498 


290 


geranyltranstransferase, putative 


SAG0499 


275 


hemolysin A 


SAG0500 


157 


arginine repressor ArgR, putative 


SAG0501 


552 


DNA repair protein RecN 


SAG0502 


278 


DegV family protein 


SAG0503 


279 


lipase/acylhydrolase 


SAG0504 


200 


conserved hypothetical protein 


SAG0505 


91 


DNA-binding protein HU 


SAG0506 


65 


hypothetical protein 


SAG0507 


310 


dihydroorotate dehydrogenase A 


SAG0508 


411 


beta-lactam resistance factor 


SAG0509 


403 


beta-lactam resistance factor 


SAG0510 


406 


murM protein, putative 


SAG0511 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0512 


438 


HD domain protein 


SAG0513 


128 


conserved hypothetical protein 


SAG0514 


894 


cation-transporting ATPase, E1-E2 family 


SAG0515 


286 


conserved hypothetical protein 


SAG0516 


643 


fructose-1,6-bisphosphatase, putative


SAG0517 


374 


iron-sulfur cluster-binding protein, putative 


SAG0518 


NA 


peptide chain release factor 2, programmed frameshift 


SAG0519 


230 


cell division ABC transporter, ATP-binding protein FtsE 


SAG0520 


309 


cell division ABC transporter, permease protein FtsX 


SAG0521 


236 


carboxymethylenebutenolidase-related protein 


SAG0522 


232 


metallo-beta-lactamase superfamily protein 
SAG0523


254 


oxidoreductase, short chain dehydrogenase/reductase family 


SAG0524 


835 


DNA polymerase III, epsilon subunit/ATP-dependent helicase 
DinG 


SAG0525 


397


aspartate aminotransferase 


SAG0526 


448 


asparaginyl-tRNA synthetase 


SAG0527 


185 


conserved hypothetical protein 


SAG0528 


327 


inosine-uridine preferring nucleoside hydrolase 


SAG0529 


38 


hypothetical protein 


SAG0530 


137 


OsmC/Ohr family protein 


SAG0531 


296 


conserved hypothetical protein 


SAG0532 


324 


conserved hypothetical protein 


SAG0533 


303 


conserved hypothetical protein 


SAG0534 


465 


dipeptidase 


SAG0535 


506 


zinc ABC transporter, zinc-binding adhesion lipoprotein


SAG0536 


86 


ribosomal protein L31 


SAG0537 


311 


DHH family protein 


SAG0538 


340 


adenosine deaminase, putative 


SAG0539 


147 


flavodoxin 


SAG0540 


91 


chorismate mutase, putative 


SAG0541 


398 


voltage-gated chloride channel family protein 


SAG0542 


127 


IS1381, transposase OrfA


SAG0543 


129 


IS1381, transposase OrfB


SAG0544 


115 


ribosomal protein L19


SAG0545 


359 


prophage LambdaSa1, site-specific recombinase, phage integrase
family 


SAG0546 


67 


conserved domain protein 


SAG0547 


185 


hypothetical protein 


SAG0548 


265 


prophage LambdaSa1, repressor protein, putative


SAG0549 


47 


hypothetical protein 


SAG0550 


74 


conserved hypothetical protein 


SAG0551 


52 


conserved hypothetical protein 


SAG0552 


62 


hypothetical protein 


SAG0553 


268 


hypothetical protein 


SAG0554 


63 


prophage LambdaSa1, transcriptional regulator, Cro/CI family


SAG0555 


249 


prophage LambdaSa1, antirepressor, putative


SAG0556 


47 


hypothetical protein 


SAG0557 


76 


hypothetical protein 


SAG0558 


74 


hypothetical protein 


SAG0559 


286 


conserved hypothetical protein 


SAG0560 


77 


conserved hypothetical protein 


SAG0561 


46 


hypothetical protein 


SAG0562 


84 


hypothetical protein 


SAG0563 


53 


hypothetical protein 


SAG0564 


160 


conserved hypothetical protein 


SAG0565 


224 


conserved domain protein 


SAG0566 


138 


prophage LambdaSa1, single-strand binding protein


SAG0567 


439 


prophage LambdaSa1, reverse transcriptase/maturase family
protein 






WO 2004/018646 



PCT/US2003/026827 



Table 1: Complete list of GBS predicted genes 



ORF 


Size 
(a.a.) 


Annotation 


SAG0568 


67 


conserved hypothetical protein 


SAG0569 


158 


conserved hypothetical protein 


SAG0570 


115 


hypothetical protein 


SAG0571 


43 


hypothetical protein 


SAG0572 


138 


conserved hypothetical protein 


SAG0573 


54 


hypothetical protein 


SAG0574 


89 


conserved hypothetical protein 


SAG0575 


110 


hypothetical protein 


SAG0576 


43 


hypothetical protein 


SAG0577 


177 


conserved hypothetical protein 


SAG0578 


88 


conserved hypothetical protein 


SAG0579 


142 


conserved hypothetical protein 


SAG0580 


111 


conserved hypothetical protein, truncation 


SAG0581 


118 


conserved hypothetical protein 


SAG0582 


422 


conserved hypothetical protein 


SAG0583 


406 


conserved hypothetical protein 


SAG0584 


62 


conserved hypothetical protein, truncation 


SAG0585 


471 


conserved hypothetical protein 


SAG0586 


154 


conserved hypothetical protein 


SAG0587 


300 


prophage LambdaSa1, structural protein, putative


SAG0588 


71 


conserved hypothetical protein 


SAG0589 


143 


conserved hypothetical protein 


SAG0590 


112 


conserved hypothetical protein 


SAG0591 


78 


conserved hypothetical protein 


SAG0592 


111 


conserved hypothetical protein 


SAG0593 


185 


prophage LambdaSa1, structural protein


SAG0594 


81 


conserved hypothetical protein 


SAG0595 


123 


conserved hypothetical protein 


SAG0596 


670 


prophage LambdaSa1, pblA protein, internal deletion


SAG0597 


506 


prophage LambdaSa1, minor structural protein, putative


SAG0598 


1374 


prophage LambdaSa1, N-acetylmuramoyl-L-alanine amidase,
family 4 


SAG0599 


668 


prophage LambdaSa1, minor structural protein, putative


SAG0600 


109 


hypothetical protein 


SAG0601 


70 


hypothetical protein 


SAG0602 


100 


conserved hypothetical protein 


SAG0603 


111 


conserved hypothetical protein 


SAG0604 


239 


prophage LambdaSa1, lysin, putative


SAG0605 


323 


conserved hypothetical protein 


SAG0606 


66 


conserved hypothetical protein 


SAG0607 


56 


conserved hypothetical protein 


SAG0608 


59 


hypothetical protein 


SAG0609 


NA 


prophage LambdaSa1, integrase, degenerate


SAG0610 


134 


conserved hypothetical protein 


SAG0611 


NA 


transposase, degenerate 


SAG0612 


53 


conserved hypothetical protein 


SAG0613 


425 


transmembrane protein Vexp1


SAG0614 


218 


ABC transporter, ATP-binding protein Vexp2


SAG0615


458 


transmembrane protein Vexp3 


SAG0616 


217 


DNA-binding response regulator VncR 


SAG0617 


439 


sensor histidine kinase VncS 


SAG0618 


195 


transposase OrfB, IS3 family, truncation 


SAG0619 


66 


conserved hypothetical protein 


SAG0620 


62 


hypothetical protein 


SAG0621 


401 


rod shape-determining protein RodA, putative


SAG0622 


186 


hydrolase, haloacid dehalogenase-like family 


SAG0623 


650 


DNA gyrase, B subunit 


SAG0624 


574 


septation ring formation regulator EzrA, putative 


SAG0625 


213 


phosphoserine phosphatase SerB 


SAG0626 


161 


MutT/nudix family protein 


SAG0627 


151 


conserved hypothetical protein 


SAG0628 


435 


enolase 


SAG0629 


354 


conserved domain protein 


SAG0630 


427 


3-phosphoshikimate 1-carboxyvinyltransferase


SAG0631 


170 


shikimate kinase 


SAG0632 


457 


psr protein 


SAG0633 


451 


RNA methyltransferase, TrmA family 


SAG0634 


70 


hypothetical protein 


SAG0635 


245 


acid phosphatase, class B 


SAG0636 


172 


conserved hypothetical protein 


SAG0637 


NA 


transcriptional regulator, TetR family, putative, authentic frameshift


SAG0638 


109 


cell wall surface anchor family protein, truncation 


SAG0639 


273 


transposase OrfB, IS3 family 


SAG0640 


91 


transposase OrfA, IS3 family 


SAG0641 


NA 


Tn5252, Orf 10 protein, degenerate 


SAG0642 


59 


hypothetical protein 


SAG0643 


NA 


chaperonin, 33 kDa, degenerate 


SAG0644


402 


transcriptional regulator, AraC family 


SAG0645 


554 


cell wall surface anchor family protein 


SAG0646 


307 


cell wall surface anchor family protein 


SAG0647 


305 


sortase family protein 


SAG0648 


260 


sortase family protein 


SAG0649 


890 


cell wall surface anchor family protein, putative 


SAG0650 


189 


sortase family protein 


SAG0651 


201


protein of unknown function 


SAG0652 


NA 


Tn5252, Orf 28 protein, degenerate 


SAG0653 


NA 


conserved hypothetical protein, degenerate 


SAG0654 


34 


hypothetical protein 


SAG0655 


57 


conserved hypothetical protein 


SAG0656 


36 


hypothetical protein 


SAG0657 


89 


hypothetical protein 


SAG0658 


383 


lipoprotein, putative 


SAG0659 


330 


ABC transporter, ATP-binding protein 


SAG0660 


272 


membrane protein 


SAG0661 


261 


conserved hypothetical protein


SAG0662


101 


cylX protein 


SAG0663 


282 


cylD protein 


SAG0664 


240 


cylG protein 


SAG0665 


101


acyl carrier protein AcpC 


SAG0666 


158 


cylZ protein 


SAG0667 


309 


cylA protein 


SAG0668 


292 


cylB protein 


SAG0669 


667 


cylE protein 


SAG0670 


317 


cylF protein 


SAG0671 


731 


cylI protein


SAG0672 


403 


cylJ protein


SAG0673 


191 


cylK protein 


SAG0674 


113 


hypothetical protein 


SAG0675 


171 


putative secreted protein 


SAG0676 


885 


proteinase, putative 


SAG0677 


1062 


hypothetical protein 


SAG0678 


NA 


endopeptidase O, degenerate 


SAG0679 


343 


protein of unknown function 


SAG0680 


339 


protein of unknown function 


SAG0681 


353 


conserved domain protein 


SAG0682 


409 


permease, putative 


SAG0683 


NA 


transmembrane protein Vexp3, putative, degenerate 


SAG0684 


223 


ABC transporter, ATP-binding protein 


SAG0685 


472 


conserved hypothetical protein 


SAG0686 


261 


DNA-entry nuclease, putative 


SAG0687 


212 


DedA family protein, putative 


SAG0688 


218 


ABC transporter, ATP-binding protein 


SAG0689 


257 


membrane protein, putative 


SAG0690 


272 


conserved hypothetical protein 


SAG0691 


294 


transcriptional regulator, LysR family 


SAG0692 


193 


regulatory protein, putative 


SAG0693 


377 


IS1548, transposase 


SAG0694 


173 


regulatory protein, putative, truncation 


SAG0695 


330 


D-lactate dehydrogenase 


SAG0696 


516 


sodium:galactoside symporter family protein, putative


SAG0697 


341 


2-keto-3-deoxygluconate kinase


SAG0698 


599 


beta-glucuronidase 


SAG0699 


223 


transcriptional regulator, GntR family 


SAG0700 


205 


2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase


SAG0701 


466 


glucuronate isomerase 


SAG0702 


348 


mannonate dehydratase 


SAG0703 


279 


D-mannonate oxidoreductase 


SAG0704 


270 


hydrolase, haloacid dehalogenase-like family 


SAG0705 


596 


glycosyl hydrolase, family 3 


SAG0706 


361 


proline dipeptidase 


SAG0707 


334 


transcriptional regulator, RegM family 


SAG0708 


488 


alpha amylase family protein 



SAG0709 


332 


glycosyl transferase, group 1 family protein 


SAG0710 


444 


glycosyl transferase, group 1 family protein 


SAG0711 


647 


threonyl-tRNA synthetase 


SAG0712 


234 


DNA-binding response regulator 


SAG0713 


339 


conserved hypothetical protein 


SAG0714 


188 


conserved hypothetical protein 


SAG0715 


216 


amino acid ABC transporter, permease protein 


SAG0716 


231 


amino acid ABC transporter, permease protein 


SAG0717 


266 


amino acid ABC transporter, amino acid-binding protein 


SAG0718 


251 


amino acid ABC transporter, ATP-binding protein


SAG0719 


236 


DNA-binding response regulator 


SAG0720 


449 


sensory box histidine kinase 


SAG0721 


269 


metallo-beta-lactamase superfamily protein 


SAG0722 


122 


conserved hypothetical protein 


SAG0723 


236 


ribonuclease III 


SAG0724 


1179 


chromosome segregation SMC protein 


SAG0725 


265 


hydrolase, haloacid dehalogenase-like family 


SAG0726 


274 


hydrolase, haloacid dehalogenase-like family 


SAG0727 


536 


signal recognition particle-docking protein FtsY 


SAG0728 


270 


ABC transporter, substrate-binding protein 


SAG0729 


300 


ABC transporter, permease protein, putative 


SAG0730 


42 


ABC transporter, ATP-binding protein 


SAG0731 


347 


bacterial luciferase family protein 


SAG0732 


720 


transcriptional accessory protein Tex, putative 


SAG0733 


142 


conserved hypothetical protein 


SAG0734 


87 


phage shock protein C, putative 


SAG0735 


44 


hypothetical protein


SAG0736 


311 


HPr(Ser) kinase/phosphatase 


SAG0737 


257 


prolipoprotein diacylglyceryl transferase 


SAG0738 


132 


conserved hypothetical protein 


SAG0739 


143 


conserved hypothetical protein 


SAG0740 


91 


conserved hypothetical protein 


SAG0741 


303 


peptidase, U32 family, putative 


SAG0742 


428 


peptidase, U32 family 


SAG0743 


70 


conserved hypothetical protein 


SAG0744 


265 


membrane protein, putative 


SAG0745 


446 


Mn2+/Fe2+ transporter, NRAMP family 


SAG0746 


369 


riboflavin biosynthesis protein RibD 


SAG0747 


208 


riboflavin synthase, alpha subunit 


SAG0748 


397 


riboflavin biosynthesis protein RibA 


SAG0749 


156 


riboflavin synthase, beta subunit 


SAG0750 


496 


lysyl-tRNA synthetase 


SAG0751 


300 


hydrolase, haloacid dehalogenase-like family 


SAG0752 


213 


phosphoglycerate mutase family protein 


SAG0753 


157 


ebsC family protein, putative 


SAG0754 


205 


conserved domain protein 


SAG0755 


282 


peptidase, U32 family 


SAG0756 


174 


conserved hypothetical protein


SAG0757


129 


protein of unknown function/lipoprotein, putative 


SAG0758 


599 


oligoendopeptidase F, putative


SAG0759 


931 


phosphoenolpyruvate carboxylase 


SAG0760 


377 


IS1548, transposase


SAG0761 


422 


cell division protein, FtsW/RodA/SpoVE family


SAG0762 


398 


translation elongation factor Tu


SAG0763 


252 


triosephosphate isomerase


SAG0764 


230 


phosphoglycerate mutase family protein


SAG0765 


681 


penicillin-binding protein


SAG0766 


198 


recombination protein RecR


SAG0767 


348 


D-alanine--D-alanine ligase


SAG0768 


455 


UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate--D-alanyl-D-alanyl ligase


SAG0769 


406 


oxalate:formate antiporter


SAG0770 


228 


membrane protein, putative 


SAG0771 


512 


cell wall surface anchor family protein 


SAG0772 


514 


peptide chain release factor 3 


SAG0773 


126 


conserved hypothetical protein 


SAG0774 


244 


ABC transporter, ATP-binding protein 


SAG0775 


220 


ABC transporter, permease protein 


SAG0776 


276 


YaeC family protein, putative 


SAG0777 


528 


ATP-dependent RNA helicase, DEAD/DEAH box family 


SAG0778 


88 


conserved hypothetical protein 


SAG0779 


254 


conserved hypothetical protein 


SAG0780 


246 


acyltransferase family protein 


SAG0781 


217 


competence protein CelA 


SAG0782 


745 


DNA internalization-related competence protein ComEC/Rec2


SAG0783 


269 


hydrolase, haloacid dehalogenase-like family 


SAG0784 


314 


sugar-binding transcriptional regulator, LacI family


SAG0785 


330 


conserved hypothetical protein 


SAG0786 


242 


conserved domain protein 


SAG0787 


345 


DNA polymerase III, delta subunit, putative 


SAG0788 


202 


superoxide dismutase, Fe-Mn 


SAG0789 


283 


transcriptional antiterminator LicT 


SAG0790


622 


PTS system, beta-glucosides-specific IIABC components


SAG0791 


475 


6-phospho-beta-glucosidase 


SAG0792 


364 


conserved hypothetical protein 


SAG0793 


380 


glycerate kinase 2 


SAG0794 


418 


permease, GntP family 


SAG0795 


354 


conserved hypothetical protein 


SAG0796 


147 


transcriptional regulator, MarR family 


SAG0797 


342 


S-adenosylmethionine:tRNA ribosyltransferase-isomerase


SAG0798 


226 


membrane protein, putative 


SAG0799 


233 


glucosamine-6-phosphate isomerase 


SAG0800 


318 


glutathione S-transferase family protein 


SAG0801 


239 


ribosomal small subunit pseudouridine synthase A 


SAG0802 


38 


hypothetical protein 


SAG0803 


383 


major facilitator family protein


SAG0804


315 


competence protein CoiA 


SAG0805 


601 


oligoendopeptidase B 


SAG0806 


208 


hydrolase, haloacid dehalogenase-like family 


SAG0807 


235 


O-methyltransferase family protein 


SAG0808 


309 


protease maturation protein, putative 


SAG0809 


161 


conserved hypothetical protein 


SAG0810 


872 


alanyl-tRNA synthetase 


SAG0811 


238 


membrane protein, putative 


SAG0812 


272 


glycosyl transferase, family 8 


SAG0813 


81 


hypothetical protein 


SAG0814 


95 


conserved hypothetical protein 


SAG0815 


71 


transcriptional regulator, Cro/CI family 


SAG0816 


253 


membrane protein, putative 


SAG0817 


187 


membrane protein, putative 


SAG0818 


319 


ribonucleoside-diphosphate reductase 2, beta subunit 


SAG0819 


719 


ribonucleoside-diphosphate reductase 2, alpha subunit 


SAG0820 


74 


ribonucleoside-diphosphate reductase 2, NrdH-redoxin 


SAG0821 


87 


phosphocarrier protein HPr 


SAG0822 


577 


phosphoenolpyruvate-protein phosphotransferase 


SAG0823 


475 


glyceraldehyde-3-phosphate dehydrogenase, NADP-dependent 


SAG0824 


417 


polysaccharide deacetylase family protein 


SAG0825 


360 


ATP-dependent RNA helicase, DEAD/DEAH box family 


SAG0826 


209 


uridine kinase 


SAG0827 


165 


conserved hypothetical protein 


SAG0828 


554 


DNA polymerase III, gamma and tau subunits 


SAG0829 


64 


conserved hypothetical protein 


SAG0830 


311 


biotin--acetyl-CoA-carboxylase ligase


SAG0831 


398 


S-adenosylmethionine synthetase 


SAG0832 


753 


protein of unknown function 


SAG0833 


181 


hypothetical protein 


SAG0834 


42 


hypothetical protein 


SAG0835 


188 


conserved hypothetical protein 


SAG0836 


184 


conserved hypothetical protein 


SAG0837 


428 


ABC transporter, ATP-binding protein 


SAG0838 


233 


hypothetical protein 


SAG0839


226 


transcriptional regulator, TenA family 


SAG0840 


265 


phosphomethylpyrimidine kinase 


SAG0841 


256 


hydroxyethylthiazole kinase 


SAG0842 


223 


thiamine-phosphate pyrophosphorylase 


SAG0843 


419 


UDP-N-acetylglucosamine 1-carboxyvinyltransferase


SAG0844 


184 


acetyltransferase, GNAT family 


SAG0845 


427 


CBS domain protein 


SAG0846 


286 


methionine aminopeptidase, type I 


SAG0847 


306 


ribonuclease BN, putative 


SAG0848 


151 


GtrA family protein 


SAG0849 


169 


conserved hypothetical protein 


SAG0850 


652 


DNA ligase, NAD-dependent 


SAG0851 


339 


bmrU protein, putative


SAG0852


766 


pullulanase, putative 


SAG0853 


622 


1,4-alpha-glucan branching enzyme


SAG0854 


379 


glucose-1-phosphate adenylyltransferase


SAG0855 


NA 


glycogen biosynthesis protein GlgD, authentic frameshift 


SAG0856 


476 


glycogen synthase 


SAG0857 


66 


ATP synthase F0, C subunit 


SAG0858 


238 


ATP synthase F0, A subunit 


SAG0859 


165 


ATP synthase F0, B subunit 


SAG0860 


178 


ATP synthase F1, delta subunit


SAG0861 


501 


ATP synthase F1, alpha subunit


SAG0862 


293 


ATP synthase F1, gamma subunit


SAG0863 


468 


ATP synthase F1, beta subunit


SAG0864 


137 


ATP synthase F1, epsilon subunit


SAG0865 


76 


conserved hypothetical protein 


SAG0866 


423 


UDP-N-acetylglucosamine 1-carboxyvinyltransferase


SAG0867 


63 


conserved hypothetical protein 


SAG0868 


285 


DNA-entry nuclease 


SAG0869 


346 


phenylalanyl-tRNA synthetase, alpha subunit 


SAG0870 


173 


acetyltransferase, GNAT family 


SAG0871 


801 


phenylalanyl-tRNA synthetase, beta subunit 


SAG0872 


300 


conserved hypothetical protein 


SAG0873 


1077 


exonuclease RexB 


SAG0874 


1207 


exonuclease RexA 


SAG0875 


305 


magnesium transporter, CorA family, putative


SAG0876 


458 


tRNA modification GTPase TrmE 


SAG0877 


636 


ABC transporter, ATP-binding protein 


SAG0878 


322 


acetoin dehydrogenase, thymine PPi dependent, E1 component,
alpha subunit 


SAG0879 


332 


acetoin dehydrogenase, thymine PPi dependent, E1 component,
beta subunit 


SAG0880 


462 


acetoin dehydrogenase, thymine PPi dependent, E2 component, 
dihydrolipoamide acetyltransferase 


SAG0881 


585 


acetoin dehydrogenase, thymine PPi dependent, E3 component, 
dihydrolipoamide dehydrogenase 


SAG0882 


329 


lipoate-protein ligase A 


SAG0883 


261 


cobyric acid synthase, putative 


SAG0884 


447 


mur ligase family protein 


SAG0885 


283 


conserved hypothetical protein TIGR00159 


SAG0886 


319 


protein of unknown function 


SAG0887 


450 


phosphoglucomutase/phosphomannomutase family protein 


SAG0888 


123 


conserved hypothetical protein 


SAG0889 


126 


conserved hypothetical protein 


SAG0890 


376 


oxygen-independent coproporphyrinogen III oxidase, putative 


SAG0891 


245 


conserved hypothetical protein 


SAG0892 


256 


hydrolase, haloacid dehalogenase-like family 


SAG0893 


218 


conserved hypothetical protein 


SAG0894 


1370 


protein of unknown function 


SAG0895 


289 


lipoyl-binding domain protein


SAG0896


108 


oxidoreductase, putative 


SAG0897 


221 


conserved hypothetical protein 


SAG0898 


83 


hypothetical protein 


SAG0899 


57 


hypothetical protein 


SAG0900 


56 


hypothetical protein 


SAG0901 


127 


hypothetical protein 


SAG0902 


45 


hypothetical protein 


SAG0903 


44 


hypothetical protein 


SAG0904 


56 


hypothetical protein 


SAG0905 


138 


nucleoside diphosphate kinase 


SAG0906 


610 


GTP-binding protein LepA 


SAG0907 


877 


protein of unknown function/lipoprotein, putative 


SAG0908 


203 


HD domain protein 


SAG0909 


154 


acetyltransferase, GNAT family 


SAG0910 


144 


PilB-related protein 


SAG0911 


930 


cation-transporting ATPase, E1-E2 family 


SAG0912 


367 


nucleoside diphosphate kinase domain protein 


SAG0913 


212 


chloramphenicol acetyltransferase 


SAG0914 


203 


conserved hypothetical protein 


SAG0915 


405 


Tn916, transposase 


SAG0916 


67 


Tn916, excisionase


SAG0917 


83 


Tn916, hypothetical protein 


SAG0918 


76 


Tn916, hypothetical protein 


SAG0919 


157 


Tn916, hypothetical protein 


SAG0920 


23 


Tn916, hypothetical protein


SAG0921 


117 


Tn916, transcriptional regulator, putative 


SAG0922 


61 


Tn916, hypothetical protein 


SAG0923 


639 


Tn916, tetracycline resistance protein


SAG0924 


28 


Tn916, tetM leader peptide 


SAG0925 


310 


Tn916, hypothetical protein 


SAG0926 


333 


Tn916, NLP/P60 family protein 


SAG0927 


725 


membrane protein, putative 


SAG0928 


NA 


Tn916, hypothetical protein, authentic frameshift 


SAG0929 


168 


Tn916, hypothetical protein 


SAG0930 


165 


Tn916, hypothetical protein 


SAG0931 


73 


Tn916, hypothetical protein


SAG0932 


401 


Tn916, transcriptional regulator, putative 


SAG0933 


461 


Tn916, FtsK/SpoIIIE family protein


SAG0934 


128 


Tn916, hypothetical protein 


SAG0935 


104 


Tn916, hypothetical protein 


SAG0936 


39 


Tn916, hypothetical protein 


SAG0937 


NA 


ABC transporter, ATP-binding protein, authentic frameshift 


SAG0938 


122 


transcriptional regulator, GntR family 


SAG0939 


1034 


DNA polymerase III, alpha subunit 


SAG0940 


340 


6-phosphofructokinase 


SAG0941 


500 


pyruvate kinase 


SAG0942 


185 


signal peptidase I, putative 


SAG0943 


47 


hypothetical protein


SAG0944


604 


glucosamine--fructose-6-phosphate aminotransferase, isomerizing 


SAG0945 


377 


IS1548, transposase


SAG0946 


109 


phnA protein 


SAG0947 


213 


amino acid ABC transporter, permease protein 


SAG0948 


209 


amino acid ABC transporter, ATP-binding protein 


SAG0949 


276 


amino acid ABC transporter, amino acid-binding protein 


SAG0950 


82 


ribosomal protein S20 


SAG0951 


306 


pantothenate kinase 


SAG0952 


196 


conserved hypothetical protein 


SAG0953 


129 


cytidine deaminase 


SAG0954 


349 


protein of unknown function/lipoprotein, putative 


SAG0955 


511 


sugar ABC transporter, ATP-binding protein 


SAG0956 


353 


sugar ABC transporter, permease protein, putative 


SAG0957 


318 


sugar ABC transporter, permease protein, putative 


SAG0958 


456 


NADH oxidase 


SAG0959 


329 


L-lactate dehydrogenase 


SAG0960 


819 


DNA gyrase, A subunit 


SAG0961 


247 


sortase SrtA 


SAG0962 


137 


glyoxylase family protein 


SAG0963 


320 


conserved hypothetical protein 


SAG0964 


375 


Na+/H+ exchanger family protein 


SAG0965 


127 


IS1381, transposase OrfA 


SAG0966 


129 


IS1381, transposase OrfB


SAG0967 


520 


GMP synthase 


SAG0968 


232 


transcriptional regulator, GntR family 


SAG0969 


444 


gid protein 


SAG0970 


247 


acetyltransferase, GNAT family 


SAG0971 


282 


protein of unknown function/lipoprotein, putative 


SAG0972 


NA 


conserved hypothetical protein, authentic frameshift 


SAG0973 


320 


nisin-resistance protein, putative 


SAG0974 


250 


ABC transporter, ATP-binding protein 


SAG0975 


651 


ABC transporter, permease protein, putative 


SAG0976 


222 


DNA-binding response regulator 


SAG0977 


312 


sensor histidine kinase 


SAG0978 


356 


site-specific recombinase, phage integrase family 


SAG0979 


553 


ABC transporter, substrate-binding protein 


SAG0980 


257 


conserved hypothetical protein 


SAG0981 


228 


satD protein 


SAG0982 


521 


signal recognition particle protein Ffh 


SAG0983 


110 


conserved hypothetical protein 


SAG0984 


437 


sensor histidine kinase CiaH 


SAG0985 


226 


DNA-binding response regulator CiaR 


SAG0986 


849 


aminopeptidase N 


SAG0987 


217 


phosphate transport system regulatory protein PhoU 


SAG0988 


252 


phosphate ABC transporter, ATP-binding protein PstB, putative


SAG0989 


267 


phosphate ABC transporter, ATP-binding protein PstB, putative 


SAG0990 


295 


phosphate ABC transporter, permease protein PstA, putative 


SAG0991 


305 


phosphate ABC transporter, permease protein


SAG0992


286 


phosphate ABC transporter, phosphate-binding protein 


SAG0993 


436 


NOL1/NOP2/sun family protein


SAG0994 


254 


inositol monophosphatase family protein 


SAG0995 


93 


conserved hypothetical protein 


SAG0996 


137 


conserved hypothetical protein 


SAG0997 


310 


macrolide-efflux protein mreA/riboflavin biosynthesis protein RibF


SAG0998 


294 


tRNA pseudouridine synthase B 


SAG0999 


143 


acetyltransferase, GNAT family 


SAG1000 


423 


conserved hypothetical protein 


SAG1001 


196 


conserved hypothetical protein 


SAG1002


292 


protease, putative 


SAG1003


876 


permease, putative 


SAG1004 


233 


ABC transporter, ATP-binding protein 


SAG1005 


706 


DNA topoisomerase I 


SAG1006 


280 


DprA/SMF protein, putative DNA processing factor 


SAG1007 


342 


iron-compound ABC transporter, iron-compound-binding protein 


SAG1008 


253 


iron compound ABC transporter, ATP-binding protein 


SAG1009 


324 


iron compound ABC transporter, permease protein 


SAG1010 


320 


iron compound ABC transporter, permease protein 


SAG1011 


182 


acetyltransferase, CysE/LacA/LpxA/NodL family 


SAG1012 


253 


ribonuclease HII 


SAG1013 


283 


GTP-binding protein 


SAG1014 


190 


conserved hypothetical protein 


SAG1015 


494 


carbon starvation protein CstA, putative 


SAG1016 


244 


response regulator 


SAG1017 


579 


sensor histidine kinase, putative 


SAG1018 


40 


lipoprotein, putative 


SAG1019 


39 


hypothetical protein 


SAG1020 


227 


lipoprotein, putative 


SAG1021 


107 


hypothetical protein 


SAG1022 


177 


hypothetical protein 


SAG1023


48 


hypothetical protein 


SAG1024 


183 


lipoprotein, putative 


SAG1025 


149 


hypothetical protein 


SAG1026 


NA 


immunogenic secreted protein, degenerate 


SAG1027 


84 


conserved hypothetical protein 


SAG1028 


196 


hypothetical protein 


SAG1029


101 


hypothetical protein 


SAG1030 


304 


protein of unknown function 


SAG1031


120 


conserved domain protein 


SAG1032 


85 


conserved hypothetical protein 


SAG1033 


1309 


FtsK/SpoIIIE family protein 


SAG1034 


55 


hypothetical protein 


SAG1035 


424 


conserved hypothetical protein 


SAG1036 


80 


conserved hypothetical protein 


SAG1037 


157 


hypothetical protein 


SAG1038 


1003 


phage infection protein, putative


SAG1039


96 


conserved hypothetical protein 


SAG1040 


260 


conserved domain protein 


SAG1041


107 


hypothetical protein 


SAG1042 


1060 


carbamoyl-phosphate synthase, large subunit 


SAG1043 


358 


carbamoyl-phosphate synthase, small subunit 


SAG1044 


307 


aspartate carbamoyltransferase 


SAG1045 


430 


dihydroorotase, multifunctional complex type 


SAG1046 


209 


orotate phosphoribosyltransferase 


SAG1047 


233 


orotidine 5-phosphate decarboxylase 


SAG1048 


410 


membrane protein, putative 


SAG1049 


513 


ABC transporter, ATP-binding protein 


SAG1050 


112 


ribonucleotide reductase, truncation 


SAG1051 


358 


aspartate-semialdehyde dehydrogenase 


SAG1052 


47 


cell wall surface anchor family protein, putative 


SAG1053 


30 


hypothetical protein 


SAG1054 


531 


cardiolipin synthetase 


SAG1055 


556 


formate--tetrahydrofolate ligase


SAG1056 


339 


lipoate-protein ligase A 


SAG1057 


292 


conserved hypothetical protein 


SAG1058 


272 


conserved hypothetical protein 


SAG1059 


110 


glycine cleavage system H protein, putative 


SAG1060 


328 


bacterial luciferase family protein 


SAG1061 


399 


oxidoreductase, FMN-binding 


SAG1062 


282 


lipoate-protein ligase A family protein 


SAG1063 


228 


flavoprotein-related protein 


SAG1064 


180 


flavoprotein family protein 


SAG1065 


190 


membrane protein, putative 


SAG1066 


572 


phosphoglucomutase


SAG1067 


178 


IS861, transposase OrfA 


SAG1068 


277 


IS861, transposase OrfB 


SAG1069 


65 


hypothetical protein 


SAG1070 


577 


ABC transporter, ATP-binding/permease protein 


SAG1071 


573 


ABC transporter, ATP-binding/permease protein 


SAG1072 


200 


conserved hypothetical protein 


SAG1073 


325 


conserved hypothetical protein 


SAG1074 


418 


serine hydroxymethyltransferase 


SAG1075 


183 


Sua5/YciO/YrdC/YwlC family protein 


SAG1076 


276 


modification methylase, HemK family 


SAG1077


359 


peptide chain release factor 1 


SAG1078


189 


thymidine kinase


SAG1079 


60 


4-oxalocrotonate tautomerase 


SAG1080 


47 


hypothetical protein 


SAG1081 


312 


ApbE family protein 


SAG1082


200 


conserved hypothetical protein 


SAG1083 


411 


conserved hypothetical protein 


SAG1084


262 


formate/nitrite transporter family protein 


SAG1085 


424 


xanthine permease 


SAG1086 


193 


xanthine phosphoribosyltransferase


SAG1087


327 


guanosine monophosphate reductase 


SAG1088 


446 


drug resistance transporter, EmrB/QacA family, putative 


SAG1089 


230 


conserved hypothetical protein 


SAG1090


666 


potassium uptake protein, putative 


SAG1091 


216 


oxidoreductase, short chain dehydrogenase/reductase family 


SAG1092 


330 


phosphate acetyltransferase 


SAG1093 


294 


ribosomal large subunit pseudouridine synthase, RluD subfamily 


SAG1094 


278 


conserved hypothetical protein 


SAG1095 


223 


GTP pyrophosphokinase family protein 


SAG1096 


190 


conserved hypothetical protein 


SAG1097 


324 


ribose-phosphate pyrophosphokinase 


SAG1098 


371 


cysteine desulphurase 


SAG1099 


115 


conserved hypothetical protein 


SAG1100 


210 


conserved hypothetical protein 


SAG1101 


226 


DNA repair protein RadC 


SAG1102 


377 


membrane protein, putative 


SAG1103 


478 


6-phospho-beta-glucosidase 


SAG1104 


204 


platelet activating factor, putative 


SAG1105 


273 


hydrolase, haloacid dehalogenase-like family 


SAG1106 


309 


transcriptional regulator, AraC family, putative 


SAG1107 


510 


voltage-gated chloride channel family protein 


SAG1108 


357 


spermidine/putrescine ABC transporter, spermidine/putrescine- 
binding protein 


SAG1109 


258 


spermidine/putrescine ABC transporter, permease protein 


SAG1110 


264 


spermidine/putrescine ABC transporter, permease protein 


SAG1111


384 


spermidine/putrescine ABC transporter, ATP-binding protein 


SAG1112 


300 


UDP-N-acetylenolpyruvoylglucosamine reductase 


SAG1113 


162 


2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase


SAG1114 


120 


dihydroneopterin aldolase 


SAG1115 


267 


dihydropteroate synthase 


SAG1116 


187 


GTP cyclohydrolase I 


SAG1117 


420 


folylpolyglutamate synthase 


SAG1118 


295 


rarD protein 


SAG1119 


288 


homoserine kinase 


SAG1120 


427 


homoserine dehydrogenase 


SAG1121 


295 


polysaccharide deacetylase family protein 


SAG1122


515 


transporter, BCCT family protein 


SAG1123 


34 


hypothetical protein 


SAG1124


458 


aldehyde dehydrogenase family protein 


SAG1125


335 


membrane protein, putative 


SAG1126 


228 


protein of unknown function 


SAG1127 


446 


conserved domain protein 


SAG1128 


65 


transcriptional regulator, Cro/CI family 


SAG1129 


36 


hypothetical protein 


SAG1130 


49 


hypothetical protein 


SAG1131 


164 


thiol peroxidase 


SAG1132 


219 


conserved hypothetical protein 


SAG1133


254 


conserved hypothetical protein 


SAG1134 


213 


transcriptional regulator, GntR family/potassium uptake protein, TrkA family


SAG1135 


183 


gls24 protein, putative 


SAG1136 


65 


conserved hypothetical protein 


SAG1137 


180 


gls24 protein, putative 


SAG1138 


64 


conserved hypothetical protein 


SAG1139 


193 


conserved hypothetical protein 


SAG1140


82


conserved hypothetical protein 


SAG1141 


112 


conserved hypothetical protein 


SAG1142 


759 


ATP-dependent DNA helicase PcrA 


SAG1143 


128 


conserved hypothetical protein 


SAG1144 


441 


uracil permease 


SAG1145 


448 


sodium:alanine symporter family protein


SAG1146 


411 


cation efflux family protein 


SAG1147 


130 


conserved hypothetical protein 


SAG1148


231 


membrane protein, putative 


SAG1149 


207 


lipoprotein, putative 


SAG1150 


400 


ribosomal protein S1


SAG1151 


76 


conserved hypothetical protein 


SAG1152 


340 


branched-chain amino acid aminotransferase 


SAG1153 


819 


DNA topoisomerase IV, A subunit 


SAG1154 


653 


DNA topoisomerase IV, B subunit 


SAG1155 


212 


membrane protein, putative 


SAG1156 


217 


uracil-DNA glycosylase 


SAG1157 


161 


conserved hypothetical protein 


SAG1158 


413 


CMP-N-acetylneuraminic acid synthetase NeuA 


SAG1159 


209 


neuD protein 


SAG1160


384 


UDP-N-acetylglucosamine-2-epimerase NeuC 


SAG1161 


341 


N-acetyl neuramic acid synthetase NeuB 


SAG1162 


466 


polysaccharide biosynthesis protein CpsL 


SAG1163


318 


polysaccharide biosynthesis protein CpsK(V) 


SAG1164 


321 


glycosyl transferase CpsJ(V) 


SAG1165


327 


glycosyl transferase CpsO(V) 


SAG1166 


295 


glycosyl transferase CpsN(V) 


SAG1167


241 


polysaccharide biosynthesis protein CpsM(V) 


SAG1168 


364 


polysaccharide biosynthesis protein cpsH(V) 


SAG1169 


163 


glycosyl transferase CpsG(V) 


SAG1170 


149 


polysaccharide biosynthesis protein CpsF 


SAG1171 


462 


glycosyl transferase CpsE 


SAG1172


229 


cpsD protein 


SAG1173 


230 


cpsC protein 


SAG1174 


243 


capsular polysaccharide biosynthesis protein CpsB 


SAG1175 


485 


capsular polysaccharide biosynthesis protein CpsA 


SAG1176 


290 


transcriptional regulator, LysR family, putative 


SAG1177 


255


conserved hypothetical protein 


SAG1178 


236 


purine nucleoside phosphorylase 


SAG1179


418 


voltage-gated chloride channel family protein, putative 


SAG1180


269 


purine nucleoside phosphorylase 


SAG1181


135 


arsenate reductase 


SAG1182


403 


phosphopentomutase


SAG1183


223 


ribose 5-phosphate isomerase 


SAG1184 


236 


conserved hypothetical protein 


SAG1185 


262 


tributyrin esterase 


SAG1186 


553 


metallo-beta-lactamase superfamily protein 


SAG1187 


253 


ABC transporter, ATP-binding protein 


SAG1188 


287 


ABC transporter, permease protein 


SAG1189 


334 


conserved hypothetical protein 


SAG1190 


551 


adherence and virulence protein A 


SAG1191 


239 


alpha-acetolactate decarboxylase 


SAG1192 


560 


acetolactate synthase, catabolic 


SAG1193 


408 


TPR domain protein 


SAG1194


396 


membrane protein, putative 


SAG1195 


153 


MutT/nudix family protein 


SAG1196


160 


mutator MutT protein 


SAG1197 


1072 


hyaluronidase 


SAG1198 


348 


dTDP-glucose 4,6-dehydratase 


SAG1199 


197 


dTDP-4-dehydrorhamnose 3,5-epimerase


SAG1200


289 


glucose-1-phosphate thymidylyltransferase


SAG1201 


367 


iminodiacetate oxidase, putative 


SAG1202 


262 


conserved hypothetical protein TIGR00456


SAG1203 


227 


conserved hypothetical protein 


SAG1204 


226 


DNA replication protein DnaD, putative 


SAG1205 


172 


adenine phosphoribosyltransferase 


SAG1206 


854 


conserved domain protein 


SAG1207 


32 


hypothetical protein 


SAG1208 


732 


single-stranded-DNA-specific exonuclease RecJ 


SAG1209 


253 


oxidoreductase, short chain dehydrogenase/reductase family


SAG1210 


309 


metallo-beta-lactamase superfamily protein 


SAG1211 


215 


conserved hypothetical protein 


SAG1212 


412 


GTP-binding protein HflX


SAG1213 


296 


tRNA delta(2)-isopentenylpyrophosphate transferase


SAG1214 


58 


hypothetical protein 


SAG1215 


305 


exfoliative toxin A, putative 


SAG1216 


1252 


pullulanase, putative 


SAG1217 


NA 


conserved hypothetical protein, authentic frameshift


SAG1218 


194 


conserved hypothetical protein 


SAG1219 


468 


peptidase, M20/M25/M40 family


SAG1220


9 on 


nitroreductase family protein


SAG1221 


NA 


glycerophosphoryl diester phosphodiesterase, putative, authentic 
point mutation 


SAG1222 


593 


excinuclease ABC, C subunit 


SAG1223 


255 


conserved hypothetical protein 


SAG1224 


446 


MATE efflux family protein


SAG1225 


136 


conserved hypothetical protein 


SAG1226 


165 


conserved hypothetical protein 


SAG1227


198 


protein of unknown function 


SAG1228 


96 


ISSdy1, transposase OrfA


SAG1229 


259 


ISSdy1, transposase OrfB


SAG1230 


96 


conserved hypothetical protein 


SAG1231 


NA 


transposase OrfB, IS3 family, degenerate 


SAG1232 


77 


transposase OrfB, IS3 family, truncation 


SAG1233 


822 


streptococcal histidine triad family protein 


SAG1234 


306 


laminin-binding surface protein 


SAG1235 


425 


GBSi1, group II intron, maturase


SAG1236 


NA 


C5a peptidase, authentic frameshift 


SAG1237 


444 


hypothetical protein 


SAG1238 


202 


hypothetical protein 


SAG1239 


76 


conserved hypothetical protein 


SAG1240 


125 


conserved hypothetical protein, truncation 


SAG1241 


91 


transposase OrfA, IS3 family 


SAG1242 


67 


transposase OrfB, IS3 family, truncation 


SAG1243


96 


ISSdy1, transposase OrfA


SAG1244 


259 


ISSdy1, transposase OrfB


SAG1245 


38 


hypothetical protein 


SAG1246 


389 


hypothetical protein 


SAG1247


399 


site-specific recombinase, phage integrase family 


SAG1248 


75 


conserved hypothetical protein 


SAG1249 


74 


transcriptional regulator, Cro/CI family 


SAG1250 


621 


Tn5252, relaxase 


SAG1251 


121 


Tn5252, Orf 9 protein 


SAG1252 


120 


Tn5252, Orf 10 protein 


SAG1253 


435 


transposase, ISL3 family 


SAG1254 


546 


mercuric reductase 


SAG1255 


130 


mercuric resistance operon regulatory protein MerR 


SAG1256 


142 


IS861, transposase OrfB, truncation 


SAG1257 


709 


cation-transporting ATPase, E1-E2 family 


SAG1258 


122 


cadmium efflux system accessory protein 


SAG1259 


99 


conserved hypothetical protein 


SAG1260 


262 


hypothetical protein 


SAG1261 


198 


conserved hypothetical protein 


SAG1262 


695 


cation-transporting ATPase, E1-E2 family 


SAG1263 


NA 


conserved domain protein, authentic frameshift 


SAG1264 


148 


transcriptional repressor CopY, putative 


SAG1265 


206 


cadmium resistance transporter, putative 


SAG1266 


152 


hypothetical protein 


SAG1267 


108 


hypothetical protein 


SAG1268 


230 


repressor protein, putative 


SAG1269 


44 


hypothetical protein 


SAG1270 


471 


ImpB/MucB/SamB family protein 


SAG1271 


116 


conserved hypothetical protein 


SAG1272 


102 


conserved hypothetical protein 


SAG1273 


118 


conserved hypothetical protein 


SAG1274 


129 


conserved hypothetical protein 


SAG1275


75 


hypothetical protein 


SAG1276 


358 


conserved hypothetical protein 


SAG1277 


163 


hypothetical protein 


SAG1278 


96 


hypothetical protein 


SAG1279 


99 


conserved domain protein 


SAG1280


2274 


SNF2 family protein 


SAG1281 


183 


hypothetical protein 


SAG1282 


63 


calcium-binding protein, putative 


SAG1283 


1631 


agglutinin receptor 


SAG1284 


196 


abortive infection protein AbiGI 


SAG1285 


281 


abortive infection protein AbiGII 


SAG1286


933 


Tn5252, Orf28 


SAG1287 


776 


Tn5252, Orf26 


SAG1288 


NA 


Tn5252, Orf25, degenerate 


SAG1289 


284 


Tn5252, Orf23 


SAG1290 


80 


hypothetical protein 


SAG1291 


605 


Tn5252, Orf 21 protein, internal deletion


SAG1292


162 


hypothetical protein 


SAG1293


194 


protease, putative 


SAG1294


77 


conserved hypothetical protein 


SAG1295 


127 


conserved hypothetical protein 


SAG1296


142 


conserved hypothetical protein 


SAG1297


451 


C-5 cytosine-specific DNA methylase


SAG1298 


31 


hypothetical protein 


SAG1299


272 


conserved hypothetical protein 


SAG1300 


57 


conserved hypothetical protein 


SAG1301


121 


ribosomal protein L7/L12 


SAG1302 


166 


ribosomal protein L10


SAG1303 


702 


ATP-dependent Clp protease, ATP-binding subunit 


SAG1304 


32 


hypothetical protein 


SAG1305 


314 


homocysteine S-methyltransferase MmuM, putative


SAG1306


458 


amino acid permease 


SAG1307


216 


hypothetical protein 


SAG1308


167 


hypothetical protein 


SAG1309


30 


hypothetical protein 


SAG1310 


182 


transcriptional regulator, TetR family 


SAG1311 


198 


GTP-binding protein 


SAG1312 


408 


ATP-dependent Clp protease, ATP-binding subunit ClpX 


SAG1313 


56 


conserved hypothetical protein 


SAG1314 


164 


dihydrofolate reductase 


SAG1315 


279 


thymidylate synthase 


SAG1316 


390 


HMG-CoA synthase 


SAG1317 


427 


3-hydroxy-3-methylglutaryl-CoA reductase


SAG1318 


149 


conserved hypothetical protein 


SAG1319 


214 


hemolysin III, putative 


SAG1320


304 


conserved hypothetical protein TIGR00147 


SAG1321


284 


glutathione S-transferase family protein, putative 


SAG1322 


72 


conserved domain protein 


SAG1323


331 


isopentenyl-diphosphate delta-isomerase 


SAG1324


330 


phosphomevalonate kinase 


SAG1325 


314 


diphosphomevalonate decarboxylase 


SAG1326


292 


mevalonate kinase, putative 


SAG1327 


409 


sensor histidine kinase 


SAG1328 


228 


DNA-binding response regulator 


SAG1329 


208 


GTP pyrophosphokinase family protein 


SAG1330 


68 


hypothetical protein 


SAG1331


979 


R5 protein 


SAG1332


146 


transcriptional regulator, MarR family, putative 


SAG1333 


690 


5'-nucleotidase family protein 


SAG1334 


136 


polypeptide deformylase, putative 


SAG1335 


449 


NADP-specific glutamate dehydrogenase 


SAG1336 


169 


membrane protein, putative 


SAG1337


589 


ABC transporter, ATP-binding/permease protein 


SAG1338 


579 


ABC transporter, ATP-binding/permease protein 


SAG1339 


157 


acetyltransferase, GNAT family 


SAG1340


622 


ABC transporter, ATP-binding protein 


SAG1341 


402 


polyA polymerase family protein 


SAG1342 


282 


DegV family protein 


SAG1343 


126 


protein of unknown function 


SAG1344 


177 


hypothetical protein 


SAG1345 


164 


conserved hypothetical protein 


SAG1346 


654 


PTS system, fructose specific IIABC components 


SAG1347 


303 


1-phosphofructokinase


SAG1348


247 


lactose phosphotransferase system repressor 


SAG1349 


411 


beta-lactam resistance factor 


SAG1350 


544 


surface antigen-related protein 


SAG1351 


307 


2-dehydropantoate 2-reductase, putative 


SAG1352 


356 


regulatory protein, putative 


SAG1353 


330 


pyridine nucleotide-disulphide oxidoreductase family protein


SAG1354 


251 


tRNA (guanine-N1)-methyltransferase


SAG1355 


172 


16S rRNA processing protein RimM


SAG1356 


503 


transcriptional regulator, RofA family 


SAG1357 


80 


KH domain protein 


SAG1358 


90 


ribosomal protein S16 


SAG1359 


415 


permease, putative 


SAG1360 


236 


ABC transporter, ATP-binding protein 


SAG1361 


414 


conserved hypothetical protein 


SAG1362


532 


carbamoyl-phosphate synthase, large subunit, putative 


SAG1363 


356 


carbamoyl-phosphate synthase, small subunit 


SAG1364 


173 


pyrimidine operon regulatory protein 


SAG1365 


296 


ribosomal large subunit pseudouridine synthase, RluD subfamily 


SAG1366 


154 


lipoprotein signal peptidase 


SAG1367 


301 


transcriptional regulator, LysR family 


SAG1368 


94 


ribosomal protein L27 


SAG1369 


112 


conserved hypothetical protein 


SAG1370


104 


ribosomal protein L21 


SAG1371


392 


conserved hypothetical protein 


SAG1372


404 


thiamine biosynthesis protein ThiI


SAG1373 


381 


cysteine desulphurase 


SAG1374 


150 


conserved hypothetical protein 


SAG1375


449 


glutathione reductase 


SAG1376


111 


conserved hypothetical protein 


SAG1377 


388 


chorismate synthase 


SAG1378 


355 


3-dehydroquinate synthase 


SAG1379 


225 


3-dehydroquinate dehydratase 


SAG1380 


385 


conserved hypothetical protein 


SAG1381 


714 


sulfatase 


SAG1382 


119 


ribosomal protein L20


SAG1383 


66 


ribosomal protein L35 


SAG1384 


176 


translation initiation factor IF-3 


SAG1385 


227 


cytidylate kinase 


SAG1386


174 


conserved hypothetical protein 


SAG1387 


65 


ferredoxin, 4Fe-4S 


SAG1388 


163 


conserved hypothetical protein 


SAG1389 


406 


peptidase T 


SAG1390 


544 


polysaccharide biosynthesis protein, putative 


SAG1391 


484 


UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase


SAG1392 


264 


iron compound ABC transporter, ATP-binding protein 


SAG1393 


310 


iron compound ABC transporter, substrate-binding protein 


SAG1394 


341 


iron compound ABC transporter, permease protein 


SAG1395 


333 


iron compound ABC transporter, permease protein 


SAG1396 


217 


conserved hypothetical protein 


SAG1397


311 


inorganic pyrophosphatase, manganese-dependent 


SAG1398 


262 


pyruvate formate-lyase-activating enzyme 


SAG1399 


444 


CBS domain protein 


SAG1400 


188 


conserved hypothetical protein 


SAG1401 


311 


conserved hypothetical protein TIGR01212 


SAG1402


213 


PAP2 family protein 


SAG1403


194 


membrane protein, putative 


SAG1404 


308 


cell wall surface anchor family protein 


SAG1405 


294 


sortase family protein 


SAG1406


293 


sortase family protein 


SAG1407 


705 


cell wall surface anchor family protein 


SAG1408 


901 


cell wall surface anchor family protein 


SAG1409 


NA 


rogB protein, authentic frameshift 


SAG1410 


379 


glycosyl transferase, group 1 family protein 


SAG1411 


282 


glycosyl transferase, group 2 family protein 


SAG1412 


474 


polysaccharide biosynthesis protein 


SAG1413 


454 


membrane protein, putative 


SAG1414 


308 


glycosyl transferase, group 2 family protein 


SAG1415 


311 


glycosyl transferase, group 2 family protein 


SAG1416 


352 


nucleotide sugar dehydratase, putative 


SAG1417 


240 


nucleotidyl transferase, putative 


SAG1418


274 


polysaccharide biosynthesis protein, putative 


SAG1419 


577 


lipoprotein, putative 


SAG1420 


117 


conserved hypothetical protein 


SAG1421 


243 


glycosyl transferase, group 2 family protein 


SAG1422


313 


glycosyl transferase, group 2 family protein 


SAG1423 


384 


glycosyl transferase, putative 


SAG1424 


284 


dTDP-4-dehydrorhamnose reductase 


SAG1425 


113 


conserved hypothetical protein 


SAG1426 


369 


RNA polymerase sigma-70 factor 


SAG1427 


602 


DNA primase 


SAG1428 


125 


large conductance mechanosensitive channel protein 


SAG1429 


58 


ribosomal protein S21 


SAG1430 


167 


conserved hypothetical protein 


SAG1431 


268 


amino acid ABC transporter, amino acid-binding protein 


SAG1432 


347 


ammonium transporter family protein 


SAG1433 


375 


conserved hypothetical protein 


SAG1434 


328 


rhodanese family protein 


SAG1435 


101 


conserved hypothetical protein 


SAG1436 


457 


glycerol-3-phosphate transporter, putative 


SAG1437 


55 


hypothetical protein 


SAG1438 


754 


glycogen phosphorylase 


SAG1439 


498 


4-alpha-glucanotransferase 


SAG1440 


342 


maltose operon repressor MalR, putative 


SAG1441 


415 


maltose/maltodextrin ABC transporter, maltose/maltodextrin-binding protein


SAG1442 


456 


maltose ABC transporter, permease protein 


SAG1443 


278 


maltose ABC transporter, permease protein 


SAG1444 


490 


proton/peptide symporter family protein 


SAG1445 


NA 


MutT/nudix family protein, authentic frameshift 


SAG1446 


62


hypothetical protein 


SAG1447 


441 


conserved hypothetical protein 


SAG1448 


502 


glycosyl transferase, group 1 family protein 


SAG1449 


795 


preprotein translocase SecA subunit, putative 


SAG1450 


330 


conserved domain protein 


SAG1451 


494 


conserved hypothetical protein 


SAG1452 


514 


conserved hypothetical protein 


SAG1453 


409 


preprotein translocase SecY family protein 


SAG1454 


398 


glycosyl transferase, putative 


SAG1455


295 


glycosyl transferase, group 2 family protein 


SAG1456 


NA 


glycosyl transferase, family 8, degenerate 


SAG1457 


129 


IS1381, transposase OrfB


SAG1458 


127 


IS1381, transposase OrfA


SAG1459 


413 


glycosyl transferase, family 8


SAG1460 


401 


glycosyl transferase, family 8 


SAG1461 


335 


conserved hypothetical protein 


SAG1462 


970 


cell wall surface anchor family protein 


SAG1463 


NA 


transcriptional regulator, RofA family, authentic point mutation 


SAG1464


663 


excinuclease ABC, B subunit 


SAG1465


306 


protease, putative 


SAG1466 


727 


glutamine ABC transporter, glutamine-binding protein/permease protein


SAG1467 


246 


glutamine ABC transporter, ATP-binding protein, GlnQ putative 


SAG1468 


116 


conserved hypothetical protein 


SAG1469 


52 


conserved hypothetical protein 


SAG1470 


437


GTP-binding protein, GTP1/Obg family


SAG1471 


42 


conserved hypothetical protein 


SAG1472 


413 


aminopeptidase PepS 


SAG1473 


192 


cell wall surface anchor family protein 


SAG1474 


680 


amidase family protein 


SAG1475 


240 


ribosomal small subunit pseudouridine synthase A 


SAG1476 


280 


oxidoreductase, aldo/keto reductase family 


SAG1477 


224 


nitroreductase family protein 


SAG1478 


130 


lactoylglutathione lyase 


SAG1479 


308 


glycosyl transferase, group 2 family protein 


SAG1480 


462 


amino acid permease 


SAG1481 


155 


SsrA-binding protein 


SAG1482 


801 


exoribonuclease, VacB/Rnb family 


SAG1483 


78 


preprotein translocase, SecG subunit 


SAG1484 


48 


ribosomal protein L33 


SAG1485 


389 


multi-drug resistance protein 


SAG1486 


548 


membrane protein, putative 


SAG1487


233 


ABC transporter, ATP-binding protein


SAG1488 


195 


dephospho-CoA kinase 


SAG1489


273 


formamidopyrimidine-DNA glycosylase 


SAG1490 


282 


transcriptional regulator, MutR family 


SAG1491 


530 


hypothetical protein 


SAG1492


58 


hypothetical protein 


SAG1493 


66 


hypothetical protein 


SAG1494 


32 


hypothetical protein 


SAG1495 


81 


CAAX amino terminal protease family protein 


SAG1496 


110 


hypothetical protein 


SAG1497 


37 


hypothetical protein 


SAG1498 


133 


hypothetical protein 


SAG1499 


299 


GTP-binding protein Era 


SAG1500 


132 


diacylglycerol kinase 


SAG1501 


161 


conserved hypothetical protein TIGR00043 


SAG1502 


268 


tetracenomycin polyketide synthesis O-methyltransferase TcmP, putative


SAG1503


39 


hypothetical protein 


SAG1504


38 


hypothetical protein 


SAG1505 


158 


MutT/nudix family protein 


SAG1506


267 


hypothetical protein 


SAG1507 


345 


PhoH family protein 


SAG1508 


590 


67 kDa Myosin-crossreactive streptococcal antigen 


SAG1509 


71 


conserved hypothetical protein 


SAG1510 


169 


peptide methionine sulfoxide reductase 


SAG1511


284 


conserved hypothetical protein 


SAG1512 


185 


ribosome recycling factor 


SAG1513 


242 


uridylate kinase 


SAG1514 


226 


peptide ABC transporter, ATP-binding protein 


SAG1515 


262 


peptide ABC transporter, ATP-binding protein 


SAG1516 


255 


peptide ABC transporter, permease protein 


SAG1517 


314 


peptide ABC transporter, permease protein 


SAG1518 


538 


peptide ABC transporter, peptide-binding protein 


SAG1519 


229


ribosomal protein L1


SAG1520 


141 


ribosomal protein L11


SAG1521 


388 


transposase, IS30 family, putative 


SAG1522 


460 


transporter, major facilitator family 


SAG1523 


404 


peptidase, M20/M25/M40 family 


SAG1524 


294 


transcriptional regulator, LysR family 


SAG1525 


117 


conserved hypothetical protein 


SAG1526 


178 


IS861, transposase OrfA 


SAG1527 


277 


IS861, transposase OrfB 


SAG1528 


571 


chorismate binding enzyme 


SAG1529 


816 


FtsK/SpoIIIE family protein 


SAG1530 


267 


peptidyl-prolyl cis-trans isomerase, cyclophilin-type 


SAG1531 


277 


manganese ABC transporter, permease protein 


SAG1532 


238 


manganese ABC transporter, ATP-binding protein 


SAG1533 


308 


manganese ABC transporter, manganese-binding adhesion lipoprotein


SAG1534 


215 


iron-dependent transcriptional regulator 


SAG1535 


229 


5-methylthioadenosine nucleosidase/S-adenosylhomocysteine nucleosidase


SAG1536


89 


conserved hypothetical protein 


SAG1537 


184 


MutT/nudix family protein 


SAG1538 


459 


UDP-N-acetylglucosamine pyrophosphorylase 


SAG1539 


31 


hypothetical protein 


SAG1540 


137 


conserved hypothetical protein 


SAG1541


125 


glyoxalase family protein 


SAG1542 


318 


oxidoreductase, Gfo/Idh/MocA family 


SAG1543 


NA 


conserved hypothetical protein, authentic frameshift 


SAG1544 


232 


gluconate 5-dehydrogenase, putative 


SAG1545 


78 


conserved hypothetical protein 


SAG1546


82 


conserved hypothetical protein 


SAG1547


166 


acetyltransferase, GNAT family 


SAG1548


422 


glycosyl transferase, group 2 family protein 


SAG1549 


127 


IS1381, transposase OrfA


SAG1550


129 


IS1381, transposase OrfB


SAG1551 


67 


hypothetical protein 


SAG1552 


719 


conserved hypothetical protein 


SAG1553 


477 


hypothetical protein 


SAG1554 


225 


hypothetical protein 


SAG1555


231 


hypothetical protein 


SAG1556


445 


branched-chain amino acid transport system II carrier protein 


SAG1557


665 


methionyl-tRNA synthetase 


SAG1558 


291 


tellurite resistance protein TehB 


SAG1559 


231 


membrane protein, putative 


SAG1560 


40 


hypothetical protein 


SAG1561 


405 


PTS system, IIC component, putative


SAG1562 


280 


conserved hypothetical protein 


SAG1563 


275 


exodeoxyribonuclease 


SAG1564 


118 


conserved hypothetical protein 


SAG1565 


158 


methylated-DNA--protein-cysteine S-methyltransferase


SAG1566 


393 


D-isomer specific 2-hydroxyacid dehydrogenase family protein 


SAG1567 


182 


acetyltransferase, GNAT family 


SAG1568 


NA 


phosphoserine aminotransferase, authentic frameshift 


SAG1569 


211 


copper homeostasis protein CutC, putative 


SAG1570 


34 


conserved hypothetical protein 


SAG1571 


53 


hypothetical protein 


SAG1572 


287 


tetrapyrrole methylase family protein 


SAG1573 


108 


conserved hypothetical protein 


SAG1574


287 


DNA polymerase III, delta prime subunit, putative 


SAG1575 


211 


thymidylate kinase 


SAG1576


267 


transposase, IS30 family, putative, truncation 


SAG1577


219 


AcuB family protein 


SAG1578 


236 


branched-chain amino acid ABC transporter, ATP-binding protein 


SAG1579


254 


branched-chain amino acid ABC transporter, ATP-binding protein 


SAG1580


317 


branched-chain amino acid ABC transporter, permease protein 


SAG1581 


289 


branched-chain amino acid ABC transporter, permease protein 


SAG1582


388 


branched-chain amino acid ABC transporter, amino acid-binding protein


SAG1583


81 


conserved hypothetical protein 


SAG1584


377 


IS1548, transposase 


SAG1585 


196 


ATP-dependent Clp protease, proteolytic subunit ClpP 


SAG1586


209 


uracil phosphoribosyltransferase 


SAG1587 


389 


aminotransferase, class I 


SAG1588 


182 


RNA methyltransferase, TrmH family, group 2 


SAG1589


450 


amino acid permease, putative 


SAG1590


449 


potassium uptake protein, Trk family 


SAG1591 


475 


cation uptake protein, Trk family 


SAG1592


83 


conserved hypothetical protein TIGR00278 


SAG1593


240 


ribosomal large subunit pseudouridine synthase B 


SAG1594 


194 


conserved hypothetical protein TIGR00281


SAG1595


235 


conserved hypothetical protein 


SAG1596 


246 


integrase/recombinase, phage integrase family 


SAG1597


157 


CBS domain protein 


SAG1598


173 


conserved hypothetical protein 


SAG1599


324 


HAM1 protein 


SAG1600


264 


glutamate racemase 


SAG1601 


79 


conserved hypothetical protein 


SAG1602 


180 


membrane protein, putative 


SAG1603


173 


transcriptional regulator, biotin repressor family 


SAG1604


229 


membrane protein, putative 


SAG1605 


167 


conserved hypothetical protein 


SAG1606


247 


RNA methyltransferase, TrmH family 


SAG1607


92 


acylphosphatase 


SAG1608


310 


lipoprotein, putative 


SAG1609


221 


amino acid ABC transporter, permease protein 


SAG1610 


285 


amino acid ABC transporter, substrate-binding protein 


SAG1611 


486 


amidase family protein 


SAG1612 


160 


transcription elongation factor GreA 


SAG1613 


600 


conserved hypothetical protein 


SAG1614 


167 


acetyltransferase, GNAT family 


SAG1615 


443 


UDP-N-acetylmuramate--alanine ligase


SAG1616 


205 


conserved hypothetical protein 


SAG1617 


32 


hypothetical protein 


SAG1618 


1032 


Snf2 family protein 


SAG1619 


377 


IS1548, transposase


SAG1620


436


phosphoglycerate dehydrogenase-related protein 


SAG1621


300 


primosomal protein DnaI


SAG1622


391 


conserved hypothetical protein 


SAG1623


159 


conserved hypothetical protein TIGR00244 


SAG1624


501 


sensor histidine kinase CsrS 


SAG1625


229 


DNA-binding response regulator CsrR 


SAG1626


177 


conserved hypothetical protein 


SAG1627


296 


heat shock protein HtpX 


SAG1628


184 


lemA protein 


SAG1629


237 


glucose-inhibited division protein B 


SAG1630


459 


sodium transport family protein 


SAG1631 


223 


potassium uptake protein, Trk family, putative 


SAG1632


276 


cobalt transport family protein 


SAG1633


558 


ABC transporter, ATP-binding protein 


SAG1634


212 


conserved hypothetical protein 


SAG1635 


402 


sodium:dicarboxylate symporter family protein 


SAG1636 


455 


branched-chain amino acid transport system II carrier protein 


SAG1637


351 


alcohol dehydrogenase, zinc-containing 


SAG1638


230 


ABC transporter, permease protein 


SAG1639 


356 


ABC transporter, ATP-binding protein


SAG1640


458 


peptidase, M20/M25/M40 family 


SAG1641


274 


YaeC family protein 


SAG1642


277 


ABC transporter, substrate-binding protein 


SAG1643


229 


glutamine amidotransferase, class I


SAG1644


j / 


hypothetical protein


SAG1645 


238 


conserved hypothetical protein TIGR01033 


SAG1646 


32 


hypothetical protein 


SAG1647 


328 


dihydroxyacetone kinase family protein 


SAG1648 


178 


transcriptional regulator, TetR family, putative 


SAG1649


37 


hypothetical protein 


SAG1650


329 


dihydroxyacetone kinase family protein 


SAG1651 


192 


dihydroxyacetone kinase family protein 


SAG1652


124 


conserved hypothetical protein 


SAG1653


237 


glycerol uptake facilitator protein 


SAG1654


134 


conserved hypothetical protein 


SAG1655


237 


transcriptional regulator, MerR family


SAG1656


369 


conserved hypothetical protein 


SAG1657 


83 


hypothetical protein 


SAG1658


244 


conserved hypothetical protein 


SAG1659 


118 


iojap-related protein 


SAG1660


173 


isochorismatase family protein 


SAG1661 


195 


conserved hypothetical protein TIGR00488 


SAG1662


210 


conserved hypothetical protein TIGR00482 


SAG1663


105 


conserved hypothetical protein TIGR00253 


SAG1664


372 


GTP-binding protein 


SAG1665


177 


hydrolase, haloacid dehalogenase-like family 


SAG1666


304 


membrane protein, putative 


SAG1667


480 


glutamyl-tRNA(Gln) amidotransferase, B subunit


SAG1668


488 


glutamyl-tRNA(Gln) amidotransferase, A subunit


SAG1669


100 


glutamyl-tRNA(Gln) amidotransferase, C subunit


SAG1670


881 


pyruvate phosphate dikinase 


SAG1671 


276 


protein of unknown function 


SAG1672


170 


CBS domain protein 


SAG1673


321 


3-hydroxyacyl-CoA dehydrogenase family protein


SAG1674


182 


isochorismatase family protein 


SAG1675 


261 


transcriptional regulator CodY, putative


SAG1676


403 


aminotransferase, class I 


SAG1677


150 


conserved hypothetical protein 


SAG1678


460 


hydrolase, haloacid dehalogenase-like family 


SAG1679


320 


asparaginase family protein 


SAG1680


292 


shikimate 5-dehydrogenase 


SAG1681 


304 


oxidoreductase, aldo/keto reductase family 


SAG1682 


671 


ATP-dependent DNA helicase RecG


SAG1683 


512 


immunogenic secreted protein, putative 


SAG1684


366 


alanine racemase 


SAG1685


119


holo-(acyl-carrier-protein) synthase 


SAG1686


335 


phospho-2-dehydro-3-deoxyheptonate aldolase


SAG1687


842 


preprotein translocase, SecA subunit 


SAG1688


315 


mannose-6-phosphate isomerase, class I 


SAG1689


293 


fructokinase


SAG1690


639 


PTS system, IIABC components


SAG1691


479 


sucrose-6-phosphate hydrolase 






sucrose operon repressor ScrR 


SAG1693 


144 


N utilization substance protein B 


SAG1694 


129 


conserved hypothetical protein 


SAG1695 


186 


translation elongation factor P 


SAG1696


38 


hypothetical protein 


SAG1697


48 


hypothetical protein 


SAG1698


99 


conserved hypothetical protein 


SAG1699


30 


hypothetical protein 
SAG1700


76


hypothetical protein 


SAG1701


56 


hypothetical protein 


SAG1702


41 


hypothetical protein 


SAG1703


54 


hypothetical protein 


SAG1704


150


cytidine/deoxycytidylate deaminase family protein


SAG1705


NA 


peptidase, M24 family, authentic point mutation 


SAG1706


238 


conserved hypothetical protein 


SAG1707


499 


drug resistance transporter, EmrB/QacA family 


SAG1708


38 


hypothetical protein 


SAG1709


942 


excinuclease ABC, A subunit 


SAG1710 


223 


conserved hypothetical protein 


SAG1711 


314 


magnesium transporter, CorA family 


SAG1712 


79 


ribosomal protein S18


SAG1713 


163 


single-strand binding protein 


SAG1714 


95 


ribosomal protein S6 


SAG1715 


374 


A/G-specific adenine glycosylase 


SAG1716 


197 


transcriptional regulator, Cro/CI family 


SAG1717 


104 


thioredoxin


SAG1718 


166 


PAP2 family protein 


SAG1719 


779


MutS2 family protein 


SAG1720


180


conserved hypothetical protein 


SAG1721


103 


conserved hypothetical protein 


SAG1722


297 


ribonuclease HIII


SAG1723


197 


signal peptidase I 


SAG1724


806 


helicase, putative 


SAG1725


160 


conserved hypothetical protein 


SAG1726


364 


DNA-damage-inducible protein P 


SAG1727


770 


formate acetyltransferase 


SAG1728


124 


FMN-binding protein 


SAG1729


309 


conserved hypothetical protein 


SAG1730 


251


conserved hypothetical protein 


SAG1731


298


membrane protein, putative 


SAG1732


282


glycerol uptake facilitator protein, putative 


SAG1733


1 CA 


universal stress protein family 


SAG1734


A AA 
4UU 


transporter, putative 


SAG1735


219


transcriptional regulator, Crp/Fnr family


SAG1736


/Ol 


X-pro dipeptidyl-peptidase 


SAG1737


119


hypothetical protein 


SAG1738


oo/^ 


polyprenyl synthetase family protein 


SAG1739


582


ABC transporter, ATP-binding protein CydC


SAG1740


57? 


ABC transporter, ATP-binding protein CydD


SAG1741 


339 


cytochrome d ubiquinol oxidase, subunit II 


SAG1742 


475 


cytochrome d oxidase, subunit I 


SAG1743


402 


pyridine nucleotide-disulphide oxidoreductase family protein 


SAG1744 


299 


prenyltransferase, UbiA family 


SAG1745 


148 


hypothetical protein 


SAG1746 


35 


hypothetical protein 


SAG1747 


99 


conserved hypothetical protein TIGR00103 
SAG1748


396 


cyclopropane-fatty-acyl-phospholipid synthase 


SAG1749 


241 


transcriptional regulator, MerR family


SAG1750 


195 


exonuclease 


SAG1751 


178 


conserved hypothetical protein 


SAG1752 


390 


conserved hypothetical protein TIGR00275 


SAG1753 


260 


conserved hypothetical protein 


SAG1754 


89 


ribosomal protein S14 


SAG1755 


38 


hypothetical protein 


SAG1756


341 


conserved hypothetical protein 


SAG1757 


336 


O-sialoglycoprotein endopeptidase family protein 


SAG1758 


135 


ribosomal-protein-alanine acetyltransferase, putative 


SAG1759 


230 


protein of unknown function 


SAG1760


76 


conserved hypothetical protein 


SAG1761 


559 


metallo-beta-lactamase superfamily protein 


SAG1762 


169 


conserved hypothetical protein 


SAG1763 


448 


glutamine synthetase, type I 


SAG1764 


123 


transcriptional regulator GlnR 


SAG1765 


179 


conserved hypothetical protein 


SAG1766 


398 


phosphoglycerate kinase 


SAG1767 


289 


acid phosphatase 


SAG1768 


336 


glyceraldehyde 3-phosphate dehydrogenase 


SAG1769 


692 


translation elongation factor G 


SAG1770 


156 


ribosomal protein S7 


SAG1771 


137 


ribosomal protein S12 


SAG1772 


270 


pur operon repressor 


SAG1773 


313 


HD domain protein 


SAG1774 


424 


conserved hypothetical protein 


SAG1775 


210 


conserved hypothetical protein 


SAG1776 


220 


ribulose-phosphate 3-epimerase 


SAG1777 


290 


conserved hypothetical protein TIGR00157 


SAG1778 


283 


rRNA (guanine-N1-)-methyltransferase, putative


SAG1779 


290 


dimethyladenosine transferase 


SAG1780 


163 


hypothetical protein 


SAG1781 


186 


primase-related protein 


SAG1782 


260 


deoxyribonuclease, TatD family 


SAG1783 


90 


hypothetical protein 


SAG1784 


130 


hypothetical protein 


SAG1785 


430 


hypothetical protein 


SAG1786


130 


protein of unknown function 


SAG1787 


420 


dltD protein 


SAG1788 


79 


D-alanyl carrier protein 


SAG1789 


421 


dltB protein 


SAG1790 


511 


D-alanine-activating enzyme 


SAG1791 


395 


sensor histidine kinase 


SAG1792 


224 


DNA-binding response regulator 


SAG1793


44 


ribosomal protein L34 


SAG1794


451 


membrane protein, putative 


SAG1795 


388 


transposase, IS30 family, putative 
SAG1796


575 


amino acid ABC transporter, permease protein 


SAG1797


407 


amino acid ABC transporter, ATP-binding protein


SAG1798 


39 


hypothetical protein 


SAG1799 


792 


xylulose-5-phosphate/fructose-6-phosphate phosphoketolase


SAG1800


363 


conserved hypothetical protein


SAG1801 


559 


transcriptional antiterminator, BglG family 


SAG1802


253 


conserved hypothetical protein 


SAG1803 


505 


carbohydrate kinase, FGGY family 


SAG1804 


329 


hypothetical protein 


SAG1805 


483 


PTS system, IIC component, putative 


SAG1806 


318 


glyoxylate reductase, NADH-dependent 


SAG1807 


339 


hypothetical protein 


SAG1808 


327 


sugar binding transcriptional regulator, LacI family 


SAG1809 


215 


transaldolase family protein 


SAG1810 


238 


carbohydrate isomerase, AraD/FucA family 


SAG1811 


287 


hexulose-6-phosphate isomerase, putative 


SAG1812 


221 


hexulose-6-phosphate synthase, putative 


SAG1813 


161 


PTS system, IIA component 


SAG1814 


92 


PTS system, IIB component 


SAG1815 


479 


transport protein SgaT, putative 


SAG1816 


205 


hypothetical protein 


SAG1817 


157 


hypothetical protein 


SAG1818 


430 


adenylosuccinate synthetase 


SAG1819 


340 


perfringolysin O regulator protein 


SAG1820 


224 


conserved hypothetical protein 


SAG1821 


750 


glutamate--cysteine ligase/amino acid ligase, putative


SAG1822


272 


protein of unknown function 


SAG1823 


418 


protein of unknown function 


SAG1824 


291 


chaperonin, 33 kDa 


SAG1825


325 


NifR3/Smm1 family protein


SAG1826


213 


deoxynucleoside kinase family protein 


SAG1827 


163 


phosphinothricin N-acetyltransferase 


SAG1828


815 


ATP-dependent Clp protease, ATP-binding subunit


SAG1829


154 


transcriptional regulator CtsR 


SAG1830 


153


conserved hypothetical protein 


SAG1831


346 


translation elongation factor Ts 


SAG1832 


256 


ribosomal protein S2 


SAG1833 


186 


alkyl hydroperoxide reductase, subunit C


SAG1834 


510 


alkyl hydroperoxide reductase, subunit F


SAG1835


134 


conserved hypothetical protein 


SAG1836


61


conserved hypothetical protein 


SAG1837 


468 


prophage LambdaSa2, lysin, putative 


SAG1838 


109 


prophage LambdaSa2, holin, putative 


SAG1839 


136 


conserved hypothetical protein 


SAG1840 


112 


hypothetical protein 


SAG1841 


76 


conserved domain protein 


SAG1842 


1224 


prophage LambdaSa2, PblB, putative 


SAG1843 


240 


conserved hypothetical protein 
SAG1844


911 


conserved hypothetical protein 


SAG1845 


42 


hypothetical protein 


SAG1846


158 


hypothetical protein 


SAG1847 


227 


conserved hypothetical protein 


SAG1848 


114 


conserved hypothetical protein 


SAG1849 


115 


hypothetical protein 


SAG1850 


101 


hypothetical protein 


SAG1851 


111 


conserved domain protein 


SAG1852 


420 


conserved domain protein 


SAG1853 


180 


prophage LambdaSa2, protease, putative 


SAG1854 


380 


conserved hypothetical protein 


SAG1855


570 


prophage LambdaSa2, terminase large subunit, putative 


SAG1856 


161 


hypothetical protein 


SAG1857


119 


prophage LambdaSa2, HNH endonuclease family protein


SAG1858


95 


hypothetical protein 


SAG1859 


180 


prophage LambdaSa2, site-specific recombinase, phage integrase family


SAG1860 


154 


conserved hypothetical protein 


SAG1861 


119 


prophage LambdaSa2, transcriptional regulator, Cro/CI family 


SAG1862


86 


hypothetical protein 


SAG1863


138 


prophage LambdaSa2, single-strand binding protein 


SAG1864


68 


hypothetical protein 


SAG1865


74 


conserved hypothetical protein 


SAG1866 


109 


conserved hypothetical protein


SAG1867


163 


conserved hypothetical protein 


SAG1868


134 


hypothetical protein 


SAG1869


437 


prophage LambdaSa2, type II DNA modification methyltransferase, putative


SAG1870


273 


prophage LambdaSa2, DNA replication protein DnaC, putative 


SAG1871


248 


prophage LambdaSa2, bacteriophage replication protein/hypothetical protein, truncation/fusion


SAG1872 


200 


hypothetical protein 


SAG1873


443 


prophage LambdaSa2, replicative DNA helicase 


SAG1874 


87 


hypothetical protein 


SAG1875 


94 


conserved hypothetical protein 


SAG1876 


176 


prophage LambdaSa2, HNH endonuclease family protein 


SAG1877 


236 


prophage LambdaSa2, antirepressor protein, putative 


SAG1878


102 


conserved domain protein 


SAG1879


156 


hypothetical protein 


SAG1880


54 


hypothetical protein 


SAG1881


51 


hypothetical protein 


SAG1882 


120 


prophage LambdaSa2, repressor protein, putative 


SAG1883 


128 


conserved hypothetical protein 


SAG1884


134 


hypothetical protein 


SAG1885 


356 


prophage LambdaSa2, site-specific recombinase, phage integrase family


SAG1886 


32 


hypothetical protein 


SAG1887 


689 


Na+/H+ exchanger family protein 
SAG1888


78 


hypothetical protein 


SAG1889 


317 


microcin immunity protein MccF, putative 


SAG1890


631 


endopeptidase O 


SAG1891 


327 


oxidoreductase, Gfo/Idh/MocA family


SAG1892 


358 


membrane protein, putative 


SAG1893 


59 


hypothetical protein 


SAG1894 


214 


cyclic nucleotide-binding domain protein 


SAG1895 


204 


polypeptide deformylase 


SAG1896 


333 


sugar binding transcriptional regulator RegR 


SAG1897 


634 


conserved hypothetical protein 


SAG1898 


271 


PTS system, IID component 


SAG1899 


288 


PTS system, IIC component 


SAG1900 


164 


PTS system, IIB component 


SAG1901 


398 


glucuronyl hydrolase 


SAG1902 


144 


PTS system, IIA component


SAG1903 


34 


hypothetical protein 


SAG1904 


270


oxidoreductase, short-chain dehydrogenase/reductase family 


SAG1905 


212 


conserved hypothetical protein 


SAG1906 


335 


carbohydrate kinase, PfkB family 


SAG1907 


212 


2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase


SAG1908 


499 


hypothetical protein 


SAG1909 


204 


nitroreductase family protein 


SAG1910 


141 


transcriptional regulator, MarR family 


SAG1911 


1468 


DNA polymerase III, alpha subunit, Gram-positive type 


SAG1912 


194 


N-acetylmuramoyl-L-alanine amidase, family 4 protein 


SAG1913 


617 


prolyl-tRNA synthetase 


SAG1914 


419 


membrane-associated zinc metalloprotease, putative 


SAG1915 


264 


phosphatidate cytidylyltransferase 


SAG1916 


250 


undecaprenyl diphosphate synthase 


SAG1917 


113 


preprotein translocase, YajC subunit 


SAG1918 


114 


bacteriocin transport accessory protein, putative


SAG1919 


387 


malate oxidoreductase 


SAG1920 


445 


citrate carrier protein, CCS family 


SAG1921 


508 


sensor histidine kinase 


SAG1922


229 


response regulator 


SAG1923


331 


UDP-glucose 4-epimerase 


SAG1924


535 


glucan 1,6-alpha-glucosidase 


SAG1925


377 


sugar ABC transporter, ATP-binding protein 


SAG1926


283 


helix-turn-helix domain protein, fis-type 


SAG1927




lacX protein 


SAG1928 


325 


tagatose 1,6-diphosphate aldolase 


SAG1929 


310 


tagatose-6-phosphate kinase 


SAG1930 


171 


galactose-6-phosphate isomerase, LacB subunit 


SAG1931 


141 


galactose-6-phosphate isomerase, LacA subunit 


SAG1932 


816 


neuraminidase-related protein 


SAG1933 


482 


PTS system, IIC component, putative 


SAG1934 


101 


PTS system, IIB component, putative 
SAG1935


157 


PTS system, IIA component, putative 


SAG1936 


258 


lactose phosphotransferase system repressor 


SAG1937 


NA 


streptococcal histidine triad family protein, degenerate 


SAG1938 


307 


adhesion lipoprotein 


SAG1939 


147 


protein of unknown function TIGR00256 


SAG1940 


738 


GTP pyrophosphokinase family protein 


SAG1941 


800 


2',3'-cyclic-nucleotide 2'-phosphodiesterase


SAG1942


151 


nrdI protein


SAG1943 


345 


conserved hypothetical protein 


SAG1944 


165 


conserved hypothetical protein 


SAG1945 


345 


iron ABC transporter, iron-binding protein 


SAG1946 


257 


DNA-binding response regulator 


SAG1947 


549 


conserved hypothetical protein 


SAG1948 


275 


PTS system, IID component 


SAG1949 


269 


PTS system, IIC component 


SAG1950 


163 


PTS system, IIB component 


SAG1951


141 


PTS system, IIA component, putative 


SAG1952 


353 


membrane protein, putative 


SAG1953 


60 


hypothetical protein 


SAG1954 


384 


membrane protein, putative 


SAG1955 


282 


ABC transporter, ATP-binding protein 


SAG1956 


96 


conserved hypothetical protein, truncation 


SAG1957 


250 


response regulator 


SAG1958 


276 


conserved hypothetical protein 


SAG1959 


727 


PTS system, IIABC components 


SAG1960 


551 


sensor histidine kinase 


SAG1961 


225 


phosphate regulon response regulator PhoB 


SAG1962 


218 


phosphate transport system regulatory protein PhoU, putative 


SAG1963 


253 


phosphate ABC transporter, ATP-binding protein 


SAG1964 


292 


phosphate ABC transporter, permease protein 


SAG1965 


281 


phosphate ABC transporter, permease protein 


SAG1966 


293 


hemolysin precursor, putative 


SAG1967 


195 


hypothetical protein 


SAG1968 


246 


conserved hypothetical protein TIGR00046 


SAG1969 


317 


ribosomal protein L11 methyltransferase


SAG1970 


102 


conserved hypothetical protein 


SAG1971 


41 


hypothetical protein 


SAG1972 


238 


transcriptional regulator, MerR family 


SAG1973


156 


acetyltransferase, GNAT family 


SAG1974 


152 


MutT/nudix family protein 


SAG1975


47 


hypothetical protein 


SAG1976


156 


conserved hypothetical protein 


SAG1977


163 


acetyltransferase, GNAT family 


SAG1978 


422 


ATPase, AAA family 


SAG1979 


253 


membrane protein, putative 


SAG1980


300 


ABC transporter, ATP-binding protein 


SAG1981 


68 


hypothetical protein 


SAG1982


359 


transcriptional regulator, Cro/CI family 
SAG1983


105 


conserved hypothetical protein 


SAG1984 


188 


conserved hypothetical protein TIGR00730 


SAG1985 


51 


hypothetical protein 


SAG1986 


375 


site-specific recombinase, phage integrase family 


SAG1987 


61 


conserved hypothetical protein 


SAG1988 


342 


conserved hypothetical protein 


SAG1989 


139 


hypothetical protein 


SAG1990 


127 


hypothetical protein 


SAG1991 


204 


transcriptional regulator, Cro/CI family 


SAG1992 


518 


protein of unknown function 


SAG1993 


373 


site-specific recombinase, phage integrase family 


SAG1994 


108 


conserved hypothetical protein 


SAG1995 


210 


hypothetical protein 


SAG1996 


263 


cell wall surface anchor family protein, putative 


SAG1997


182 


hypothetical protein 


SAG1998 


457 


hypothetical protein 


SAG1999 


47 


hypothetical protein 


SAG2000 


666 


membrane protein, putative 


SAG2001 


756 


conjugal transfer protein, interruption-C 


SAG2002 


129 


IS1381, transposase OrfB


SAG2003 


127 


IS1381, transposase OrfA


SAG2004 


67 


conjugal transfer protein, interruption-N 


SAG2005 


136 


conserved hypothetical protein 


SAG2006 


88 


conserved hypothetical protein 


SAG2007 


317 


conserved hypothetical protein 


SAG2008 


84 


conserved hypothetical protein 


SAG2009 


88 


conserved hypothetical protein 


SAG2010 


157 


hypothetical protein 


SAG2011 


160 


conserved hypothetical protein 


SAG2012 


90 


hypothetical protein 


SAG2013 


189 


hypothetical protein 


SAG2014 


449 


hypothetical protein 


SAG2015 


99 


transcriptional regulator, Cro/CI family


SAG2016 


125 


hypothetical protein 


SAG2017 


429 


transcriptional regulator, Cro/CI family 


SAG2018 


553 


FtsK/SpoIIIE family protein 


SAG2019 


153 


hypothetical protein 


SAG2020 


98 


hypothetical protein 


SAG2021 


826 


cell wall surface anchor family protein 


SAG2022 


417 


transposase, ISL3 family 


SAG2023 


546 


mercuric reductase 


SAG2024 


130 


mercuric resistance operon regulatory protein MerR 


SAG2025 


522 


Mn2+/Fe2+ transporter, NRAMP family 


SAG2026 


240 


membrane protein, putative 


SAG2027 


205 


ABC transporter, ATP-binding protein 


SAG2028 


36 


conserved hypothetical protein 


SAG2029 


284 


streptomycin resistance protein 


SAG2030 


130 


hypothetical protein 
SAG2031


202 


hypothetical protein 


SAG2032 


111 


conserved hypothetical protein 


SAG2033 


162 


acetyltransferase, GNAT family 


SAG2034 


247 


membrane protein, putative 


SAG2035 


300 


ABC transporter, ATP-binding protein 


SAG2036 


68 


hypothetical protein 


SAG2037 


358 


transcriptional regulator, Cro/CI family 


SAG2038 


204 


PAP2 family protein 


SAG2039 


98 


conserved hypothetical protein 


SAG2040 


186 


conserved hypothetical protein TIGR00730 


SAG2041 


287 


protease, putative 


SAG2042 


100 


rhodanese family protein 


SAG2043 


255 


cAMP factor 


SAG2044 


62 


hypothetical protein 


SAG2045 


179 


DNA topology modulation protein FlaR, putative 


SAG2046 


361 


glycerol dehydrogenase, putative 


SAG2047 


235 


conserved hypothetical protein 


SAG2048 


614 


5-methyltetrahydrofolate--homocysteine methyltransferase, putative


SAG2049 


745 


5-methyltetrahydropteroyltriglutamate--homocysteine methyltransferase


SAG2050 


107 


conserved hypothetical protein 


SAG2051 


230 


branched-chain amino acid transport protein AzlC, putative 


SAG2052 


41 


hypothetical protein 


SAG2053 


1570 


serine protease, subtilase family, putative 


SAG2054 


228 


DNA-binding response regulator 


SAG2055 


462 


sensor histidine kinase 


SAG2056 


202 


chromosome assembly-related protein 


SAG2057 


833 


leucyl-tRNA synthetase 


SAG2058 


415 


major facilitator family protein 


SAG2059 


281 


protein of unknown function 


SAG2060 


398 


glycosyl transferase, family 8 


SAG2061 


401 


glycosyl transferase, family 8 


SAG2062 


179 


transcription antitermination protein NusG 


SAG2063 


630 


pathogenicity protein, putative 


SAG2064 


57 


preprotein translocase, SecE subunit, putative 


SAG2065 


50 


ribosomal protein L33 


SAG2066 


773 


penicillin-binding protein 2A 


SAG2067 


294 


ribosomal large subunit pseudouridine synthase, RluD subfamily 


SAG2068 


546 


conserved hypothetical protein


SAG2069 


403 


phosphopentomutase 


SAG2070 


223 


deoxyribose-phosphate aldolase 


SAG2071 


400 


Na+ dependent nucleoside transporter 


SAG2072 


259 


uridine phosphorylase 


SAG2073 


245 


transcriptional regulator, GntR family 


SAG2074 


540 


60 kDa chaperonin


SAG2075 


94 


chaperonin, 10 kDa 


SAG2076 


267 


ABC transporter, ATP-binding protein 
SAG2077


298 


ABC transporter, permease protein 


SAG2078 


320 


protein of unknown function/lipoprotein, putative


SAG2079 


265 


hydrolase, haloacid dehalogenase-like family 


SAG2080 


286 


glyoxalase family protein 


SAG2081 


243 


conserved hypothetical protein 


SAG2082 


205 


anaerobic ribonucleoside-triphosphate reductase activating protein 


SAG2083 


163 


acetyltransferase, GNAT family 


SAG2084 


310 


virulence factor MviM, putative 


SAG2085 


47 


conserved hypothetical protein 


SAG2086 


723 


anaerobic ribonucleoside-triphosphate reductase 


SAG2087 


495 


membrane protein, putative


SAG2088 


40 


hypothetical protein 


SAG2089 


105 


conserved hypothetical protein 


SAG2090 


136 


conserved hypothetical protein TIGR00250 


SAG2091 


88 


conserved hypothetical protein 


SAG2092 


132 


conserved hypothetical protein 


SAG2093 


379 


recA protein


SAG2094 


NA 


competence/damage-inducible protein CinA, authentic frameshift 


SAG2095 


183 


DNA-3-methyladenine glycosylase I


SAG2096 


196 


Holliday junction DNA helicase RuvA 


SAG2097 


418 


transporter, putative 


SAG2098 


659 


DNA mismatch repair protein HexB 


SAG2099 


33 


hypothetical protein 


SAG2100 


67 


cold shock protein, CSD family 


SAG2101 


858 


DNA mismatch repair protein HexA 


SAG2102 


145 


arginine repressor ArgR, putative 


SAG2103 


563 


arginyl-tRNA synthetase 


SAG2104 


102 


conserved hypothetical protein 


SAG2105 


290 


conserved hypothetical protein 


SAG2106 


314 


conserved hypothetical protein 


SAG2107 


583


aspartyl-tRNA synthetase 


SAG2108 


426 


histidyl-tRNA synthetase 


SAG2109 


60 


ribosomal protein L32 


SAG2110


49 


ribosomal protein L33 


SAG2111


173 


conserved hypothetical protein 


SAG2112


494 


site-specific recombinase, phage integrase family 


SAG2113 


82 


conserved hypothetical protein 


SAG2114


342 


conserved hypothetical protein 


SAG2115


143 


hypothetical protein 


SAG2116 


151 


conserved hypothetical protein 




SAG2117


71


hypothetical protein 


SAG2118 


306 


transcriptional regulator, Cro/CI family 


SAG2119 


373 


conserved domain protein 


SAG2120 


269 


hypothetical protein 


SAG2121 


223 


hypothetical protein 


SAG2122 


223 


DNA-binding response regulator 


SAG2123 


454 


sensor histidine kinase 


SAG2124 


517 


membrane protein, putative 
SAG2125


308 


carbamate kinase 


SAG2126 


332 


ornithine carbamoyltransferase 


SAG2127 


431 


sensor histidine kinase 


SAG2128 


277 


response regulator 


SAG2129 


240 


amino acid ABC transporter, ATP-binding protein 


SAG2130 


504 


amino acid ABC transporter, amino acid-binding protein/permease protein


SAG2131 


847 


membrane protein, putative 


SAG2132 


247 


conserved hypothetical protein 


SAG2133 


118 


conserved hypothetical protein 


SAG2134 


772 


membrane protein, putative 


SAG2135 


179 


transcriptional regulator, TetR family, putative 


SAG2136 


98 


conserved hypothetical protein 


SAG2137 


203 


ribosomal protein S4 


SAG2138 


95 


conserved hypothetical protein 


SAG2139 


451 


replicative DNA helicase 


SAG2140 


150 


ribosomal protein L9 


SAG2141 


660 


DHH family protein 


SAG2142 


613 


glucose inhibited division protein A 


SAG2143 


203 


membrane protein, putative 


SAG2144 


373 


tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase 


SAG2145 


222 


L-serine dehydratase, iron-sulfur-dependent, beta subunit 


SAG2146 


290 


L-serine dehydratase, iron-sulfur-dependent, alpha subunit 


SAG2147 


234 


protein of unknown function/lipoprotein, putative 


SAG2148 


179 


LysM domain protein 


SAG2149 


264 


cobalt transport family protein 


SAG2150 


280 


ABC transporter, ATP-binding protein 


SAG2151 


279 


ABC transporter, ATP-binding protein 


SAG2152 


180 


CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase


SAG2153 


427 


peptidase, M16 family


SAG2154 


414 


conserved hypothetical protein 


SAG2155 


117 


conserved hypothetical protein 


SAG2156 


369 


recF protein 


SAG2157 


278 


transporter, putative 


SAG2158 


220 


transcriptional regulator, Cro/CI family 


SAG2159 


493 


inosine-5'-monophosphate dehydrogenase


SAG2160 


161 


transcriptional regulator, ArgR family 


SAG2161 


226 


transcriptional regulator, Crp/Fnr family 


SAG2162 


234 


conserved hypothetical protein 




SAG2163


410


arginine deiminase 


SAG2164 


136 


acetyltransferase, GNAT family 


SAG2165 


337 


ornithine carbamoyltransferase 


SAG2166 


475 


arginine/ornithine antiporter


SAG2167 


318 


carbamate kinase 


SAG2168 


341 


tryptophanyl-tRNA synthetase 


SAG2169 


230 


membrane protein, putative 


SAG2170 


290 


conserved hypothetical protein 






WO 2004/018646 



PCT/US2003/026827 



Table 1: Complete list of GBS predicted genes 



ORF 


Size (a.a.)


Annotation 


SAG2171 


539 


ABC transporter, ATP-binding protein 


SAG2172 


859 


ABC transporter, permease protein, putative 


SAG2173 


159 


conserved hypothetical protein TIGR00246 


SAG2174 


409 


serine protease 


SAG2175 


257 


partitioning protein, ParB family 






Table 2 



ORF


Size (aa)


Signal Peptide


Sortase motif


Lipoprotein


Other


Western blot


FACS


GBS specific


Annotation


SAG0017 


447 


+ 














pcsB 


SAG0031 


299 


+ 














peptidase, M23/M37 family 


SAG0032 


434 


+ 








+ 


+ 




group B streptococcal surface immunogenic protein 


SAG0034 


438 


+ 










+ 




sugar ABC transporter, sugar-binding protein 


SAG0051 


126 


+ 








+ 






MORN motif family protein 


SAG0079 


212 








+ 


+ 


+ 




adenylate kinase 


SAG0086


85 






+ 








+


lipoprotein, putative 


SAG0093 


250 


+ 








+ 


+ 




D-alanyl-D-alanine carboxypeptidase family protein 


SAG0094 


191 


+ 














N-acetylmuramoyl-L-alanine amidase, family 4 protein 


SAG0108 


308 


+ 














conserved hypothetical protein 


SAG0114 


322 


+ 




+ 










ribose ABC transporter, periplasmic D-ribose-binding protein


SAG0124 


356 


+ 














sensor histidine kinase 


SAG0132 


294 


+ 














SPFH domain/Band 7 family protein 


SAG0134 


96 


+ 












+ 


hypothetical protein 


SAG0146 


395 


+ 














penicillin-binding protein 4, putative 


SAG0147 


411 
















D-alanyl-D-alanine carboxypeptidase family protein 


SAG0148 


551 






+ 




+ 


- 




oligopeptide ABC transporter, substrate-binding protein, putative


SAG0166 


123 


+ 














conserved domain protein 


SAG0176 


94 


+ 














conserved hypothetical protein 


SAG0187 


542 


+ 








+ 


+ 




oligopeptide ABC transporter, oligopeptide-binding protein


SAG0206 


60 






+ 










lipoprotein, putative 


SAG0213 


39 


+ 












+ 


hypothetical protein 


SAG0231 


135 
















hypothetical protein 


SAG0242 


308 






+ 




+ 


- 




amino acid ABC transporter, amino acid-binding protein 


SAG0245 


152 






+ 








+ 


protein of unknown function/lipoprotein, putative 


SAG0255 


315 


+ 














conserved hypothetical protein 


SAG0257 


53 






+ 








+ 


lipoprotein, putative 


SAG0265 


235 


+ 














conserved hypothetical protein 


SAG0290 


270 










+ 


+ 




ABC transporter, substrate-binding protein 


SAG0298 


750 
















penicillin-binding protein 1A 
SAG0306


535 


+ 














KH domain protein 


SAG0321 


339 


+ 














sensor histidine kinase, putative 


SAG0329 


106 


+ 














PTS system, cellobiose-specific IIB component 


SAG0368 


435 
















protein of unknown function 


SAG0371 


167 














+ 


hypothetical protein 


SAG0383 


334 


+ 




+ 




+ 


- 




protein of unknown function/lipoprotein, putative 


SAG0392 


521 


+ 


+ 






+ 


+ 




cell wall surface anchor family protein 


SAG0394 


345 








+








sensor histidine kinase 


SAG0405 


347 


+ 




+ 




+ 


+ 




protein of unknown function/lipoprotein, putative 


SAG0406 


299 


+ 














UTP-glucose-1-phosphate uridylyltransferase


SAG0407 


338 


+ 














glycerol-3-phosphate dehydrogenase (NAD(P)+)


SAG0416 


1233 




+ 








+ 




protease, putative 


SAG0421 


1055 




+ 






+ 


- 




cell wall surface anchor family protein 


SAG0433 


1389 




+ 












surface protein Rib 


SAG0437 


123 






+ 










lipoprotein, putative 


SAG0451 


149 






+








+ 


bacteriocin transport accessory protein, putative


SAG0455 


357 


+ 














conserved hypothetical protein 


SAG0472 


126 










+ 


- 




rhodanese-like family protein 


SAG0482 


84 
















YGGT family protein 


SAG0499 


275 








+ 








hemolysin A 


SAG0503 


279 










+ 






lipase/acylhydrolase 


SAG0504 


200 


+ 














conserved hypothetical protein 


SAG0506 


65 


+ 












+ 


hypothetical protein 


SAG0521 


236 


+ 














carboxymethylenebutenolidase-related protein 


SAG0535 


506 












+ 




zinc ABC transporter, zinc-binding adhesion lipoprotein


SAG0596 


670 








+ 








prophage LambdaSa1, pblA protein, internal deletion


SAG0603 


in 








+ 








conserved hypothetical protein 


SAG0604 


239 
















prophage LambdaSa1, lysin, putative


SAG0617 


439 








+ 








sensor histidine kinase VncS 


SAG0624 


574 


+ 














septation ring formation regulator EzrA, putative 


SAG0629 


354 
















conserved domain protein 


SAG0635 


245 


+ 








+ 






acid phosphatase, class B 


SAG0638 


109 
















cell wall surface anchor family protein, interruption-N


SAG0645


554 




+ 






+ 


+ 




cell wall surface anchor family protein 


SAG0646 


307 


+ 


+ 








- 




cell wall surface anchor family protein


SAG0647 


305 


+ 














sortase family protein 


SAG0649 


890 




+ 








+ 




cell wall surface anchor family protein, putative 


SAG0658 


383 






+ 










lipoprotein, putative 


SAG0675 


171 


+ 














putative secreted protein 


SAG0676 


885 








+ 








proteinase, putative 


SAG0677 


1062 




+ 












hypothetical protein 


SAG0679 


343 


+ 




+ 




+ 


- 




protein of unknown function 


SAG0680 


339 










+ 


- 




protein of unknown function 


SAG0681 


353 
















conserved domain protein 


SAG0686 


261 


+ 








+ 






DNA-entry nuclease, putative 


SAG0714 


188 


+ 












+ 


conserved hypothetical protein 


SAG0717 


266 


+ 








+ 


+ 




amino acid ABC transporter, amino acid-binding protein 


SAG0720 


449 








+ 








sensory box histidine kinase 


SAG0738 


132 


+ 














conserved hypothetical protein 


SAG0739 


143 


+ 














conserved hypothetical protein 


SAG0742 


428 








+ 








peptidase, U32 family 


SAG0755 


282 


+ 














peptidase, U32 family 


SAG0757 


129 






+ 




+ 


- 




protein of unknown function/lipoprotein, putative 


SAG0764 


230 








+ 








phosphoglycerate mutase family protein 


SAG0765 


681 


+ 














penicillin-binding protein 2b 


SAG0771 


512 




+ 






+ 


+ 


+ 


cell wall surface anchor family protein 


SAG0776 


276 


+ 




+ 










YaeC family protein, putative 


SAG0777 


528 








+ 


+ 


+ 




ATP-dependent RNA helicase, DEAD/DEAH box family 


SAG0785 


330 


+














conserved hypothetical protein 


SAG0808 


309 


+ 




+






+ 




protease maturation protein, putative 


SAG0824 


417 
















polysaccharide deacetylase family protein 


SAG0832 


753 


+ 








+ 


+ 




protein of unknown function 


SAG0833 


181 














+ 


hypothetical protein 


SAG0867 


63 


+ 














conserved hypothetical protein 


SAG0868 


285 


+ 








+ 






DNA-entry nuclease 


SAG0886 


319 
















protein of unknown function 






WO 2004/018646 



PCT/US2003/026827 



Table 2 



ORF 


Size 
(aa) 


Signal 
Peptide 


Sortase 
motif 


Lipoprotein


Other 


Western 
blot 


FACS 


GBS 
specific 


Annotation 


SAG0904 


56 


+ 












+ 


hypothetical protein 


SAG0907 


877 


+ 




+ 




+ 


- 




protein of unknown function/lipoprotein, putative 


SAG0926 


333 
















Tn916, NLP/P60 family protein 


SAG0942 


185 


+ 








+ 






signal peptidase I, putative 


SAG0949 


276 


+ 




+ 




+ 


+ 




amino acid ABC transporter, amino acid-binding protein 


SAG0954 


349 






+ 




+ 


- 




protein of unknown function/lipoprotein, putative 


SAG0961 


247 


+ 










- 




sortase SrtA 


SAG0963 


320 
















conserved hypothetical protein 


SAG0971 


282 






+ 










protein of unknown function/lipoprotein, putative 


SAG0973 


320 














+ 


nisin-resistance protein, putative 


SAG0977 


312 








+ 








sensor histidine kinase


SAG0979 


553 






+ 






- 




ABC transporter, substrate-binding protein 


SAG0984 


437 


+ 














sensor histidine kinase CiaH


SAG0992 


286 


+ 




+ 






+ 




phosphate ABC transporter, phosphate-binding protein 


SAG1007 


342 


+ 




+ 




+ 


+




iron-compound ABC transporter, iron-compound-binding protein


SAG1014


190 


+ 








- 


- 




conserved hypothetical protein 


SAG1018 


40 






+ 








+ 


lipoprotein, putative 


SAG1024


183 






+ 










lipoprotein, putative 


SAG1029


101 


+ 














hypothetical protein 


SAG1030


304 


+ 








+ 


+ 




protein of unknown function 


SAG1037


157 


+ 












+ 


hypothetical protein 


SAG1052 


47 














+ 


cell wall surface anchor family protein, putative 


SAG1072


200 


+ 














conserved hypothetical protein 


SAG1094


278 








+ 




+ 




conserved hypothetical protein 


SAG1108 


357 


+ 










- 




spermidine/putrescine ABC transporter, spermidine/putrescine-binding protein


SAG1121 


295 


+ 














polysaccharide deacetylase family protein 


SAG1126


228 


+ 










+ 




protein of unknown function 


SAG1127


446 
















conserved domain protein 


SAG1130


49 


+ 












+ 


hypothetical protein 


SAG1138


64 


+ 














conserved hypothetical protein 


SAG1139


193 
















conserved hypothetical protein 





SAG1149


207 


+ 




+ 










lipoprotein, putative 


SAG1184 


236 


+ 














conserved hypothetical protein 


SAG1186 


553 








+ 








metallo-beta-lactamase superfamily protein 


SAG1189


334 
















conserved hypothetical protein 


SAG1190


551 








+ 








adherence and virulence protein A 


SAG1197 


1072 
















hyaluronidase 


SAG1201 


367 


+ 














iminodiacetate oxidase, putative 


SAG1206 


854 
















conserved domain protein 


SAG1214 


58 


+ 














hypothetical protein 


SAG1216 


1252 




+ 






+ 


- 




pullulanase, putative 


SAG1227


198 


+ 








+ 


- 




protein of unknown function 


SAG1233


822 


+ 








+ 


- 




streptococcal histidine triad family protein 


SAG1234


306 


+ 








+ 


+ 




laminin-binding surface protein 


SAG1238


202 


+ 














hypothetical protein 


SAG1283 


1631 










+ 


+ 




agglutinin receptor 


SAG1313 


56 


+ 














conserved hypothetical protein 


SAG1327 


409 


+ 














sensor histidine kinase 


SAG1331


979 


+ 


+ 






+ 






R5 protein 


SAG1333


690 


+ 


+ 






+ 


+ 




5'-nucleotidase family protein


SAG1350


544 
















surface antigen-related protein 


SAG1361 


414 


+ 














conserved hypothetical protein 


SAG1371 


392 
















conserved hypothetical protein 


SAG1393 


310 






+ 










iron compound ABC transporter, substrate-binding protein 


SAG1404


308 




+ 








- 




cell wall surface anchor family protein 


SAG1405


294 


+ 














sortase family protein 


SAG1406


293 


+ 














sortase family protein 


SAG1407


705 


+ 


+ 






+ 


+ 




cell wall surface anchor family protein 


SAG1408 


901 




+ 












cell wall surface anchor family protein 


SAG1419 


577 
















lipoprotein, putative 


SAG1431


268 






+ 










amino acid ABC transporter, amino acid-binding protein 


SAG1433 


375 


+ 














conserved hypothetical protein 


SAG1441 


415 


+ 










+ 




maltose/maltodextrin ABC transporter, maltose/maltodextrin-binding protein


SAG1462


970 




+ 












cell wall surface anchor family protein 


SAG1473


192 




+ 










+ 


cell wall surface anchor family protein 


SAG1474


680 
















amidase family protein 


SAG1483 


78 


+ 














preprotein translocase, SecG subunit 


SAG1488 


195 










+ 






dephospho-CoA kinase 


SAG1491


530 














+ 


hypothetical protein 


SAG1508


590 








+ 


+


- 




67 kDa Myosin-crossreactive streptococcal antigen 


SAG1518 


538 






+ 










peptide ABC transporter, peptide-binding protein 


SAG1530


267 










+ 


- 




peptidyl-prolyl cis-trans isomerase, cyclophilin-type 


SAG1533


308 






+ 




+ 






manganese ABC transporter, manganese-binding adhesion lipoprotein


SAG1544 


232 


+ 














gluconate 5-dehydrogenase, putative 


SAG1551 


67 


+ 












+ 


hypothetical protein 


SAG1552 


719 
















conserved hypothetical protein 


SAG1553 


477 


+ 












+ 


hypothetical protein 


SAG1562


280 


+ 














conserved hypothetical protein 


SAG1582 


388 


+ 




+ 






- 




branched-chain amino acid ABC transporter, amino acid-binding protein


SAG1590 


449 








+ 


+ 


+ 




potassium uptake protein, Trk family 


SAG1601 


79 
















conserved hypothetical protein 


SAG1610 


285 






+ 




+ 


- 




amino acid ABC transporter, substrate-binding protein 


SAG1618 


1032 












+ 




Snf2 family protein 


SAG1624


501 


+ 














sensor histidine kinase CsrS 


SAG1628


184 


+ 














lemA protein 


SAG1631


223 


+ 










- 




potassium uptake protein, Trk family, putative 


SAG1641 


274 


+










- 




YaeC family protein 


SAG1642 


277 






+ 




+ 


- 




ABC transporter, substrate-binding protein 


SAG1683


512 
















immunogenic secreted protein, putative 


SAG1706


238 


+ 














conserved hypothetical protein 


SAG1745


148 


+ 












+ 


hypothetical protein 


SAG1752


390 


+ 














conserved hypothetical protein TIGR00275 


SAG1759


230 
















protein of unknown function 


SAG1762 


169 


+ 














conserved hypothetical protein


SAG1767 


289 
















acid phosphatase 


SAG1768


336 








+ 


+ 






glyceraldehyde 3-phosphate dehydrogenase 


SAG1774


424 


+ 














conserved hypothetical protein 


SAG1786


130 


+ 








+ 


- 




protein of unknown function 


SAG1787


420 


+ 














dltD protein 


SAG1791


395 


+ 














sensor histidine kinase 


SAG1822 


272 


+ 








+ 


- 




protein of unknown function 


SAG1823 


418 








+ 


+ 


+ 




protein of unknown function 


SAG1837


468 








+ 








prophage LambdaSa2, lysin, putative 


SAG1838 


109 


+ 














prophage LambdaSa2, holin, putative 


SAG1839


136 


+ 














conserved hypothetical protein 


SAG1842


1224 








+ 








prophage LambdaSa2, PblB, putative 


SAG1912 


194 


+ 














N-acetylmuramoyl-L-alanine amidase, family 4 protein 


SAG1921 


508 


+ 














sensor histidine kinase 


SAG1932 


816 


+ 














neuraminidase-related protein 


SAG1938 


307 


+ 




+ 










adhesion lipoprotein 


SAG1941 


800 




+ 






+ 


- 




2',3'-cyclic-nucleotide 2'-phosphodiesterase


SAG1945


345 


+ 














iron ABC transporter, iron-binding protein 


SAG1947


549 








+ 








conserved hypothetical protein 


SAG1960 


551 








+ 


+ 


+ 




sensor histidine kinase 


SAG1966 


293 












- 




hemolysin precursor, putative 


SAG1996 


263 




+ 












cell wall surface anchor family protein, putative 


SAG1997


182 


+ 














hypothetical protein 


SAG1998


457 


+ 














hypothetical protein 


SAG2021 


















cell wall surface anchor family protein 


SAG2043 


255 


+ 














cAMP factor 


SAG2053 


1570


+ 














serine protease, subtilase family, putative 


SAG2055 


462 








+ 








sensor histidine kinase 


SAG2056 


, 20; 


+












+ 


chromosome assembly-related protein 


SAG2063 


630


+


+ 












pathogenicity protein, putative 


SAG2078 


320


+








+ 






protein of unknown function/lipoprotein, putative


SAG2094 




+ 










+ 




competence/damage-inducible protein CinA, authentic frameshift





SAG2121 


223 


+ 












+ 


hypothetical protein 


SAG2123 


454 
















sensor histidine kinase 


SAG2141


660 


+ 








+ 






DHH family protein 


SAG2147 


234 






+ 




+ 


+ 




protein of unknown function/lipoprotein, putative 


SAG2148 


179 


+ 














LysM domain protein 


SAG2174 


409 


+ 














serine protease 


SAG0013 


428 


+ 














protein of unknown function


Table 3



ORF 


Annotation 


SAG0038 


conserved hypothetical protein 


SAG0048 


transcriptional regulator Cro/CI family 


SAG0091 


transcriptional regulator ComX1, putative


SAG0137


conserved hypothetical protein 


SAG0686 


DNA-entry nuclease putative 


SAG0770 


membrane protein putative 


SAG0868 


DNA-entry nuclease 


SAG1143


conserved hypothetical protein 


SAG1233 


streptococcal histidine triad family protein 


SAG1596 


integrase/recombinase phage integrase family 


SAG1616 


conserved hypothetical protein 


SAG1721 


conserved hypothetical protein

Table 5

Strain        Source        Capsular serotype   Reference
090           Lancefield    Ia
515           Houston       Ia                  (1)
A909          Lancefield    Ia                  (2)
Davis         Channing      Ia
DK1           Houston       Ia
DK8           Houston       Ia
H36b          Lancefield    Ib                  (2)
(S7) 7357b    Channing      Ib                  (3)
18RS21        Lancefield    II                  (4)
DK21          Houston       II
COH1          Seattle       III                 (5)
COH31         Seattle       III                 (6)
D136C         Lancefield    III                 (4)
M781          Houston       III                 (7)
M732          Houston       III                 (8)
1169NT1       Atlanta       V                   (9)
2603V/R       Italy         V                   This study
CJB111        Houston       V                   (10)
JM9130013     Japan         VIII                (11)
SMU014        Japan         VIII                (11)
CJB110        Houston       Nontypeable         (12)

Table 6



Cluster 1

SAG0230 conserved hypothetical protein
SAG0231 hypothetical protein
SAG0232 hypothetical protein
SAG0233 hypothetical protein
SAG0234 hypothetical protein
SAG0235 hypothetical protein

Cluster 2

SAG0222 conserved domain protein
SAG0223 conserved hypothetical protein, fusion
SAG0225 hypothetical protein
SAG0226 recombination protein
SAG0227 hypothetical protein
SAG0228 conserved hypothetical protein
SAG0229 conserved hypothetical protein

Cluster 3

SAG0634 hypothetical protein
SAG0635 acid phosphatase, class B
SAG0636 conserved hypothetical protein
SAG0638 cell wall surface anchor family protein, interruption-N
SAG0640 transposase OrfA, IS3 family
SAG0642 hypothetical protein
SAG0643 chaperonin, 33 kDa, degenerate
SAG0644 transcriptional regulator, AraC family
SAG0645 cell wall surface anchor family protein
SAG0646 cell wall surface anchor family protein
SAG0647 sortase family protein
SAG0648 sortase family protein
SAG0649 cell wall surface anchor family protein, putative
SAG0650 sortase family protein
SAG0651 protein of unknown function

Cluster 4

SAG1898 PTS system, IID component
SAG1899 PTS system, IIC component
SAG1900 PTS system, IIB component
SAG1901 glucuronyl hydrolase
SAG1902 PTS system, IIA component
SAG1905 conserved hypothetical protein
SAG1906 carbohydrate kinase, PfkB family

Cluster 5

SAG0247 hypothetical protein
SAG0248 hypothetical protein
SAG0249 hypothetical protein
SAG0674 hypothetical protein
SAG0675 putative secreted protein
SAG0676 proteinase, putative
SAG0677 hypothetical protein
SAG0680 protein of unknown function
SAG0681 conserved domain protein
SAG0684 ABC transporter, ATP-binding protein
SAG1698 conserved hypothetical protein

Cluster 6

SAG0261 IS1381, transposase OrfB
SAG0262 IS1381, transposase OrfA
SAG0965 IS1381, transposase OrfA
SAG0966 IS1381, transposase OrfB
SAG2002 IS1381, transposase OrfB

Cluster 7

SAG1027 conserved hypothetical protein
SAG1028 hypothetical protein
SAG1029 hypothetical protein
SAG1030 protein of unknown function
SAG1031 conserved domain protein
SAG1032 conserved hypothetical protein

Cluster 8

SAG1253 transposase, ISL3 family
SAG1254 mercuric reductase
SAG1255 mercuric resistance operon regulatory protein MerR
SAG2022 transposase, ISL3 family
SAG2023 mercuric reductase
SAG2024 mercuric resistance operon regulatory protein MerR

Cluster 9

SAG1993 site-specific recombinase, phage integrase family
SAG1994 conserved hypothetical protein
SAG1995 hypothetical protein
SAG1996 cell wall surface anchor family protein, putative
SAG1997 hypothetical protein
SAG1998 hypothetical protein
SAG2000 membrane protein, putative
SAG2001 conjugal transfer protein, interruption-C
SAG2007 conserved hypothetical protein
SAG2008 conserved hypothetical protein
SAG2009 conserved hypothetical protein
SAG2010 hypothetical protein



SAG2011 


conserved hypothetical protein 


SAG2012 


hypothetical protein 


SAG2016 


hypothetical protein 


SAG2017 


transcriptional regulator, Cro/CI family 


SAG2025 


Mn2+/Fe2+ transporter, NRAMP family


Cluster 10 




SAG1039 


conserved hypothetical protein 


SAG1447 


conserved hypothetical protein 


SAG1448 


glycosyl transferase, group 1 family protein 


SAG1449 


preprotein translocase SecA subunit, putative 


SAG1450 


conserved domain protein 


SAG1452 


conserved hypothetical protein 


SAG1453 


preprotein translocase SecY family protein 



SAG1454 


glycosyl transferase, putative 


SAG1455 


glycosyl transferase, group 2 family protein 


SAG1456 


glycosyl transferase, family 8, degenerate 


SAG1459 


glycosyl transferase, family 8


SAG1460


glycosyl transferase, family 8 


SAG1461 


conserved hypothetical protein 


SAG1462 


cell wall surface anchor family protein 


SAG1463 


transcriptional regulator, RofA family, authentic point mutation 


SAG1469 


conserved hypothetical protein

SAG1471 conserved hypothetical protein
SAG1933 PTS system, IIC component, putative

Cluster 11

SAG0009 hypothetical protein
SAG0120 hypothetical protein
SAG0157 deoxyribonuclease-related protein, degenerate
SAG0186 hypothetical protein
SAG0216 hypothetical protein
SAG0236 hypothetical protein
SAG0307 hypothetical protein
SAG0308 ABC transporter, ATP-binding protein
SAG0311 DNA-binding response regulator, authentic point mutation
SAG0518 peptide chain release factor 2, programmed frameshift
SAG0553 hypothetical protein
SAG0555 prophage LambdaSa1, antirepressor, putative
SAG0564 conserved hypothetical protein
SAG0579 conserved hypothetical protein
SAG0580 conserved hypothetical protein, truncation
SAG0611 transposase, degenerate
SAG0637 transcriptional regulator, TetR family, putative, authentic frameshift
SAG0641 Tn5252, Orf 10 protein, degenerate
SAG0652 Tn5252, Orf 28 protein, degenerate



SAG0655 


conserved hypothetical protein 


SAG0678 


endopeptidase O, degenerate


SAG0683 


transmembrane protein Vexp3, putative, degenerate 


SAG0855 


glycogen biosynthesis protein GlgD, authentic frameshift 


SAG0898 


hypothetical protein 


SAG0899 


hypothetical protein 


SAG0901 


hypothetical protein 


SAG0902 


hypothetical protein 


SAG0903 


hypothetical protein 


SAG0917 


Tn916, hypothetical protein 


SAG0920 


Tn916, hypothetical protein 


SAG0922 


Tn916, hypothetical protein 


SAG0924 


Tn916, tetM leader peptide 


SAG0928 


Tn916, hypothetical protein, authentic frameshift 


SAG0936 


Tn916, hypothetical protein 


SAG0943 


hypothetical protein 


SAG0972 


conserved hypothetical protein, authentic frameshift 


SAG1023 


hypothetical protein 


Ci A 1 AO A 


hypothetical protein


SAG1123 


hypothetical protein 


SAG1129 


hypothetical protein 


SAG1136 


conserved hypothetical protein 


SAG1217 


conserved hypothetical protein, authentic frameshift



SAG1231 


transposase OrfB, IS3 family, degenerate 


SAG1242 


transposase OrfB, IS3 family, truncation 


SAG1309 


hypothetical protein


SAG1331 


R5 protein 


SAG1437 


hypothetical protein 


SAG1445 


MutT/nudix family protein, authentic frameshift


SAG1484 


ribosomal protein L33 


SAG1493 


hypothetical protein


SAG1539 


hypothetical protein 


SAG1543 


conserved hypothetical protein, authentic frameshift 


SAG1560 


hypothetical protein 


SAG1568 


phosphoserine aminotransferase, authentic frameshift


SAG1570 


conserved hypothetical protein 


SAG1601 


conserved hypothetical protein 


SAG1644 


hypothetical protein 


SAG1646 


hypothetical protein 


SAG1699 


hypothetical protein 


SAG1705 


peptidase, M24 family, authentic point mutation 


SAG1708 


hypothetical protein 


SAG1857 


prophage LambdaSa2, HNH endonuclease family protein 


SAG1864 


hypothetical protein 


SAG1868 


hypothetical protein



SAG1869 prophage LambdaSa2, type II DNA modification methyltransferase, putative
SAG1872 hypothetical protein
SAG1874 hypothetical protein
SAG1876 prophage LambdaSa2, HNH endonuclease family protein
SAG1878 conserved domain protein
SAG1881 hypothetical protein
SAG1883 conserved hypothetical protein
SAG1886 hypothetical protein
SAG1903 hypothetical protein
SAG1937 streptococcal histidine triad family protein, degenerate
SAG1971 hypothetical protein
SAG1979 membrane protein, putative
SAG1980 ABC transporter, ATP-binding protein
SAG1981 hypothetical protein
SAG1982 transcriptional regulator, Cro/CI family
SAG1983 conserved hypothetical protein
SAG1984 conserved hypothetical protein TIGR00730
SAG1985 hypothetical protein
SAG1991 transcriptional regulator, Cro/CI family
SAG1992 protein of unknown function
SAG1999 hypothetical protein
SAG2004 conjugal transfer protein, interruption-N
SAG2039 conserved hypothetical protein
SAG2044 hypothetical protein
SAG2052 hypothetical protein
SAG2065 ribosomal protein L33
SAG2094 competence/damage-inducible protein CinA, authentic frameshift
SAG2099 hypothetical protein

Cluster 12

SAG1164 glycosyl transferase CpsJ(V)
SAG1165 glycosyl transferase CpsO(V)
SAG1166 glycosyl transferase CpsN(V)
SAG1167 polysaccharide biosynthesis protein CpsM(V)
SAG1168 polysaccharide biosynthesis protein cpsH(V)

Cluster 13

SAG0581 conserved hypothetical protein
SAG0582 conserved hypothetical protein
SAG0583 conserved hypothetical protein
SAG0585 conserved hypothetical protein
SAG0586 conserved hypothetical protein
SAG0587 prophage LambdaSa1, structural protein, putative
SAG0588 conserved hypothetical protein
SAG0589 conserved hypothetical protein
SAG0590 conserved hypothetical protein
SAG0591 conserved hypothetical protein
SAG0593 prophage LambdaSa1, structural protein
SAG0594 conserved hypothetical protein
SAG0595 conserved hypothetical protein
SAG0596 prophage LambdaSa1, pblA protein, internal deletion

Cluster 14

SAG0915 Tn916, transposase
SAG0918 Tn916, hypothetical protein
SAG0919 Tn916, hypothetical protein
SAG0921 Tn916, transcriptional regulator, putative
SAG0925 Tn916, hypothetical protein
SAG0926 Tn916, NLP/P60 family protein
SAG0927 membrane protein, putative
SAG0929 Tn916, hypothetical protein
SAG0930 Tn916, hypothetical protein
SAG0931 Tn916, hypothetical protein
SAG0932 Tn916, transcriptional regulator, putative
SAG0933 Tn916, FtsK/SpoIIIE family protein
SAG0934 Tn916, hypothetical protein
SAG0935 Tn916, hypothetical protein
SAG0937 ABC transporter, ATP-binding protein, authentic frameshift



Cluster 15




C A 01 
xjrWJX Ojj 


CUAloCI VCU- il^y LICILIlOLlL'Ctl piU>lClll 


oAul O J / 


piljpililgC J^cllIlUUaOci^j lyoiii, p mauve 




cuiiocrvcu. iiypouiciiccu. piuiciii 


q a ni Q/tn 


nypoineiicai protein 




propnage l^aiiiDciaoaz, jtdixj, puiaiiv© 


CAPJ 
o/VVjrioH-j 


L-UlloCI VCU liypUUlCLlUal piULCJLIl 


O-rVVJ 1 o*+*+ 


vAHioCi vcvi iiypvjuicLiva.1 pujiciii 




iiypo 1X16110 di pruicui 


CAf T 1 0^ 1 
o/AAJTiOJ I 


L/UIloCi VCU UUJJLiClIlI piuiciii 




/-•i^x-ri c^Ttr^rl HAmain n tropin 
L/U11£>CL VCU UCJillcLill piULCUl 


Q A fil Qco 
O/TlVJ 1 OJJ 


piUpilCl^C' JL^ttlIIUU«.OCl<£»5 pi LI LCaoC, pUiCttlVw 


o/\\jri o j*+ 


CUiloCl VCU IiypUlilCllCaJ. piLILClII 


kjrvvj 1 ojj 


r\rrn"\li ^ erf* T atnhHaS!;*'? fprmiTiaQp lartyp diiHiirnt ■nnfpfivp 

pi U J-Jlldi^W J—/Clt.LlLJ\X(Xi^jCl^y IW>1 lULXXlCXO V lalg& Ol4.l_SUi.lll} UlttXtJL V ^ 




fivnotVipfinal nrotftin 


uriVJIOJO 


lrvnntliptip.fil fiTYYfpiri 

11 V L/ w Ll JLw llvfll LSI VS twill 


SAG1859 


prophage LambdaSa2, site-specific recombinase, phage integrase family 


SAG1860 


conserved hypothetical protein 


SAG1861 


prophage LambdaSa2, transcriptional regulator, Cro/CI family 


SAG1862 


hypothetical protein 


SAG1863 


prophage LambdaSa2, single-strand binding protein 


SAG1865 


conserved hypothetical protein

SAG1866 conserved hypothetical protein
SAG1867 conserved hypothetical protein

SAG1870 prophage LambdaSa2, DNA replication protein DnaC, putative
SAG1871 prophage LambdaSa2, bacteriophage replication protein/hypothetical protein, truncation/fusion

SAG1873 prophage LambdaSa2, replicative DNA helicase
SAG1877 prophage LambdaSa2, antirepressor protein, putative
SAG1879 hypothetical protein

SAG1882 prophage LambdaSa2, repressor protein, putative
SAG1884 hypothetical protein

SAG1885 prophage LambdaSa2, site-specific recombinase, phage integrase family



Cluster 16 

SAG1247 site-specific recombinase, phage integrase family 

SAG1250 Tn5252, relaxase 

SAG1251 Tn5252, Orf 9 protein

SAG1252 Tn5252, Orf 10 protein 

SAG1256 IS861, transposase OrfB, truncation 

SAG1257 cation-transporting ATPase, E1-E2 family 

SAG1258 cadmium efflux system accessory protein 

SAG1259 conserved hypothetical protein 

SAG1260 hypothetical protein 

SAG1261 conserved hypothetical protein



SAG1262 


cation-transporting ATPase, E1-E2 family 


SAG1263 


conserved domain protein, authentic frameshift 


SAG1264 


transcriptional repressor CopY, putative 


SAG1265 


cadmium resistance transporter, putative


SAG1266 


hypothetical protein


SAG1267 


hypothetical protein


SAG1268 


repressor protein, putative


SAG1270 


ImpB/MucB/SamB family protein


SAG1271 


conserved hypothetical protein


SAG1272 


conserved hypothetical protein 


SAG1273 


conserved hypothetical protein 


SAG1274 


conserved hypothetical protein


SAG1276 


conserved hypothetical protein 


SAG1277 


hypothetical protein 


SAG1278 


hypothetical protein 


SAG1279 


conserved domain protein


SAG1280 


SNF2 family protein 


SAG1281 


hypothetical protein 


SAG1283 


agglutinin receptor 


SAG1284 


abortive infection protein AbiGI 


SAG1285 


abortive infection protein AbiGII 






SAG1287 


Tn5252, Orf26



SAG1288 


Tn5252, Orf25, degenerate 


SAG1289 


Tn5252, Orf23 


SAG1290 


hypothetical protein 


SAG1291 


Tn5252, Orf 21 protein, internal deletion 


SAG1292 


hypothetical protein 


SAG1293 


protease, putative


SAG1294 


conserved hypothetical protein 


SAG1295 


conserved hypothetical protein 


SAG1296 


conserved hypothetical protein 


SAG1297 


C-5 cytosine-specific DNA methylase 


SAG1299 


conserved hypothetical protein 


SAG1304 


hypothetical protein 






Table 7 



Locus 


Annotation 


Housekeeping 




SAG0466 


thiolase 


SAG0471 


glucokinase 


SAG0492 


amino acid ABC transporter, ATP-binding protein 


SAG0767 


D-alanine--D-alanine ligase


SAG1086 


xanthine phosphoribosyltransferase 


SAG1600 


glutamate racemase 


SAG1680 


shikimate 5-dehydrogenase 


SAG1723 


signal peptidase I 


Surface-exposed 




SAG0079 


adenylate kinase 


SAG0093 


D-alanyl-D-alanine carboxypeptidase family protein 


SAG0163 


competence protein CglA 


SAG0290 


ABC transporter, substrate-binding protein 


SAG0368 


protein of unknown function 


SAG0503 


lipase/acylhydrolase 


SAG1473


cell wall surface anchor family protein


SAG1552 


conserved hypothetical protein 


SAG1641 


YaeC family protein 


SAG2147 


protein of unknown function/lipoprotein, putative 


SAG2148 


LysM domain protein

Table 8: GBS genes shared with GAS and pneumococcus

ORFxxxxx Annotation 

ORF0QQQ3 PcsB protein (pscB) 

ORF00004 ribose-phosphate pyrophosphokinase (prsA) 

ORFQ0005 aminotransferase, class I 

QRF00006 recombination protein O 

ORFQ0009 fatty acid/phospholipid synthesis protein PIsX (pIsX) 

QRF00011 phosphoribosylaminoimidazoie-succinocarboxamide synthase (purC) 

ORF0Q012 phosphoribosyiformylglycinamidine synthase, putative 

ORF00013 amidophosphoribosyltransferase (purF) 

QRFQQ014 phosphoribosylformylgiycinamidine cyclo-ligase (purM) 

ORFQQ015 phosphoribosylglycinamide formyltransferase (purN) 

ORF00020 group B streptococcal surface immunogenic protein 

ORF00021 N-acetyimannosamine-6-P epimerase, putative 

ORFQ0022 sugar ABC transporter, sugar-binding protein 

ORF0Q023 sugar ABC transporter, permease protein 

ORFQ0024 sugar ABC transporter, permease protein 

ORFQ0026 conserved hypothetical protein 

ORF00027 N-acetylneuraminate lyase, putative 

ORFQ0028 expressed RQK family protein 

ORF00030 phosphosugar-binding transcriptional regulator, RpiR family, putative 

ORF00031 phosphoribosylamine-glycine Hgase (purD) 

ORF00032 phosphoribosylaminoimidazole carboxylase, catalytic subunit (purE) 

ORF00033 phosphoribosylaminoimidazole carboxylase, ATPase subunit (purK) 

ORFQ0036 adenylosuccinate lyase (purB) 

ORFQ0037 transcriptional regulator, Cro/Cl family 

ORFQ0038 Hoiliday junction PNA helicase RuvB (ruvB) 

ORF00039 phosphotyrosine protein phosphatase, low molecular weight 

ORF00040 MORN motif family protein 

ORF00041 membrane protein, putative 

ORF00Q43 alcohol dehydrogenase, propanol-preferring (adhP) 

ORF00045 MATE efflux family protein 

ORF00046 ribosomal protein S10 (rpsJ) 

ORF00047 ribosomal protein L3 (rplC)

ORF00048 ribosomal protein L4 (rplD)

ORF00049 ribosomal protein L23 (rplW) 

ORF00050 ribosomal protein L2 (rplB) 

ORF00052 ribosomal protein S19 (rpsS)

ORF00054 ribosomal protein L22 (rplV)

ORF00055 ribosomal protein S3 (rpsC)

ORF00056 ribosomal protein L16 (rplP)

ORF00058 ribosomal protein L29 (rpmC) 

ORF00059 ribosomal protein S17 (rpsQ) 

ORF00060 ribosomal protein L14 (rplN)

ORF00061 ribosomal protein L24 (rplX)

ORF00063 ribosomal protein L5 (rplE)

ORF00065 ribosomal protein S8 (rpsH) 

ORF00066 ribosomal protein L6 (rplF) 

ORF00068 ribosomal protein L18 (rplR)

ORF00069 ribosomal protein S5 (rpsE) 

ORF00070 ribosomal protein L30 (rpmD) 

ORF00071 ribosomal protein L15 (rplO)

ORF00072 preprotein translocase, SecY subunit 

ORF00073 adenylate kinase (adk)

ORF00074 translation initiation factor IF-1 (infA)

ORF00075 ribosomal protein L36 (rpmJ) 

ORF00077 ribosomal protein S13 (rpsM)

ORF00078 ribosomal protein S11 (rpsK)

ORF00080 DNA-directed RNA polymerase, alpha subunit (rpoA)

ORF00093 transcriptional regulator ComX1, putative

ORF00094 phosphoglycerate mutase family protein 

ORF00097 heat-inducible transcription repressor HrcA (hrcA) 

ORF00098 heat shock protein GrpE (grpE) 

ORF00099 dnaK protein (dnaK) 

ORF00100 dnaJ protein (dnaJ)

ORF00101 transcriptional regulator, GntR family 

ORF00102 tRNA pseudouridine synthase A (truA) 

ORF00103 phosphomethylpyrimidine kinase, putative

ORF00104 conserved hypothetical protein 

ORF00105 conserved hypothetical protein

ORF00106 conserved hypothetical protein 

ORF00107 trigger factor (tig) 

ORF00108 DNA-directed RNA polymerase, delta subunit, putative 

ORF00109 CTP synthase (pyrG) 

ORF00111 deoxyuridine 5'-triphosphate nucleotidohydrolase (dut)

ORF00113 carbonic anhydrase-related protein

ORF00115 pyridine nucleotide-disulphide oxidoreductase family protein

ORF00116 glutamyl-tRNA synthetase (gltX)

ORF00119 ribose ABC transporter, ATP-binding protein (rbsA)

ORF00122 ribose operon repressor RbsR (rbsR) 

ORF00125 ABC transporter, ATP-binding protein 

ORF00126 DNA-binding response regulator 

ORF00128 sensor histidine kinase 

ORF00131 fructose-bisphosphate aldolase (fba) 

ORF00132 L-2-hydroxyisocaproate dehydrogenase

ORF00133 ribosomal protein L28 (rpmB)

ORF00134 conserved hypothetical protein 

ORF00135 DAK2 domain protein 

ORF00136 expressed SPFH domain/Band 7 family protein 

ORF00141 amino acid ABC transporter, ATP-binding protein 

ORF00142 amino acid ABC transporter, amino acid-binding protein/permease protein

ORF00143 conserved hypothetical protein 

ORF00145 undecaprenol kinase, putative

ORF00146 negative regulator of competence MecA, putative

ORF00149 ABC transporter, ATP-binding protein 

ORF00150 conserved hypothetical protein 

ORF00151 selenocysteine lyase (csdB)

ORF00152 NifU family protein

ORF00153 conserved hypothetical protein 

ORF00155 D-alanyl-D-alanine carboxypeptidase

ORF00158 oligopeptide ABC transporter, permease protein 

ORF00160 oligopeptide ABC transporter, ATP-binding protein

ORF00161 oligopeptide ABC transporter, ATP-binding protein 

ORF00167 adc operon repressor AdcR (adcR) 

ORF00168 zinc ABC transporter, ATP-binding protein 

ORF00169 zinc ABC transporter, permease protein 

ORF00172 tyrosyl-tRNA synthetase (tyrS)

ORF00173 penicillin-binding protein 1B, putative 

ORF00174 DNA-directed RNA polymerase, beta subunit (rpoB) 

ORF00176 DNA-directed RNA polymerase beta' subunit (rpoC)

ORF00178 conserved hypothetical protein

ORF00179 competence protein CglA (cglA)

ORF00180 competence protein CglB (cglB)

ORF00181 conserved hypothetical protein 

ORF00183 conserved hypothetical protein 

ORF00184 acetate kinase (ackA)

ORF00190 pyrroline-5-carboxylate reductase (proC)

ORF00191 glutamyl-aminopeptidase (pepA) 

ORF00198 single-strand binding protein (ssb)

ORF00211 PTS system, IIABC components 

ORF00212 alpha amylase family protein

ORF00214 transcriptional antiterminator, BglG family

ORF00219 PTS system, IIC component, putative

ORF00224 ribosomal protein S15 (rpsO) 

ORF00225 polyribonucleotide nucleotidyltransferase (pnp) 

ORF00227 serine O-acetyltransferase (cysE)

ORF00229 cysteinyl-tRNA synthetase (cysS)

ORF00230 conserved hypothetical protein 

ORF00231 RNA methyltransferase, TrmH family, group 3

ORF00233 DegV family protein

ORF00236 ribosomal protein L13 (rplM) 

ORF00237 ribosomal protein S9 (rpsI)

ORF00261 transcriptional regulator, MutR family

ORF00262 transporter, putative 

ORF00263 amino acid ABC transporter, permease protein 

ORF00264 amino acid ABC transporter, amino acid-binding protein 

ORF00265 amino acid ABC transporter, permease protein 

ORF00266 amino acid ABC transporter, ATP-binding protein 

ORF00295 N-acetylglucosamine-6-phosphate deacetylase (nagA)

ORF00296 conserved hypothetical protein 

ORF00297 glycyl-tRNA synthetase, alpha subunit (glyQ)

ORF00299 glycyl-tRNA synthetase, beta subunit (glyS) 

ORF00300 conserved hypothetical protein 

ORF00302 glycerol kinase (glpK)

ORF00303 alpha-glycerophosphate oxidase

ORF00304 glycerol uptake facilitator protein (glpF) 

ORF00306 conserved hypothetical protein 

ORF00307 transketolase (tkt)

ORF00309 ABC transporter, ATP-binding protein

ORF00310 membrane protein, putative 

ORF00313 PTS system, IIBC components

ORF00314 glutamate 5-kinase (proB)

ORF00315 gamma-glutamyl phosphate reductase (proA) 

ORF00316 conserved hypothetical protein TIGR00006

ORF00318 penicillin-binding protein 2X (pbpX)

ORF00319 phospho-N-acetylmuramoyl-pentapeptide-transferase (mraY) 

ORF00320 ATP-dependent RNA helicase, DEAD/DEAH box family

ORF00321 ABC transporter, substrate-binding protein 

ORF00322 amino acid ABC transporter, permease protein

ORF00323 amino acid ABC transporter, ATP-binding protein 

ORF00325 thioredoxin reductase (trxB)

ORF00326 conserved hypothetical protein 

ORF00327 NAD synthetase (nadE) 

ORF00328 aminopeptidase C (pepC) 

ORF00329 penicillin-binding protein 1A (pbp1A)

ORF00330 recombination protein U (recU)

ORF00331 conserved hypothetical protein 
ORF00335 conserved hypothetical protein

ORF00336 conserved hypothetical protein

ORF00337 autoinducer-2 production protein LuxS (luxS)

ORF00338 KH domain protein

ORF00348 guanylate kinase (gmk)

ORF00349 DNA-directed RNA polymerase, omega subunit, putative

ORF00350 primosomal protein N' (priA) 

ORF00351 methionyl-tRNA formyltransferase (fmt) 

ORF00352 Sun protein (sun) 

ORF00353 serine/threonine phosphatase, putative

ORF00354 serine/threonine protein kinase 

ORF00355 conserved hypothetical protein

ORF00356 sensor histidine kinase, putative 

ORF00358 DNA-binding response regulator 

ORF00359 hydrolase, haloacid dehalogenase family/peptidyl-prolyl cis-trans isomerase, cyclophilin type

ORF00360 general stress protein, putative 

ORF00361 pyruvate formate-lyase-activating enzyme (pflA)

ORF00362 transcriptional regulator, DeoR family 

ORF00363 transcriptional regulator, putative

ORF00364 PTS system, cellobiose-specific IIA component (celC)

ORF00366 PTS system, cellobiose-specific IIB component (celA)

ORF00367 PTS system, cellobiose-specific IIC component (celB)

ORF00368 formate acetyltransferase (pflD)

ORF00369 transaldolase family protein 

ORF00371 glycerol dehydrogenase (gldA) 

ORF00372 cysteine synthase A (cysK) 

ORF00373 conserved hypothetical protein TIGR00257 

ORF00374 helicase, putative 

ORF00375 competence protein F, putative

ORF00376 ribosomal subunit interface protein (yfiA) 

ORF00385 enoyl-CoA hydratase/isomerase family protein 

ORF00386 transcriptional regulator, MarR family

ORF00387 3-oxoacyl-(acyl-carrier-protein) synthase III (fabH)

ORF00388 acyl carrier protein (acpP)

ORF00390 enoyl-(acyl-carrier-protein) reductase II (fabK) 

ORF00391 malonyl CoA-acyl carrier protein transacylase (fabD) 

ORF00392 3-oxoacyl-[acyl-carrier protein] reductase (fabG)

ORF00393 3-oxoacyl-(acyl-carrier-protein) synthase II (fabF)

ORF00394 acetyl-CoA carboxylase, biotin carboxyl carrier protein (accB)

ORF00395 (3R)-hydroxymyristoyl-(acyl-carrier-protein) dehydratase (fabZ) 

ORF00396 acetyl-CoA carboxylase, biotin carboxylase (accC)

ORF00397 acetyl-CoA carboxylase, carboxyl transferase, beta subunit (accD)

ORF00398 acetyl-CoA carboxylase, carboxyl transferase, alpha subunit (accA)

ORF00400 seryl-tRNA synthetase (serS)

ORF00403 conserved hypothetical protein

ORF00404 PTS system, mannose-specific IID component

ORF00405 PTS system, mannose-specific IIC component (manM)

ORF00406 PTS system, mannose-specific IIAB components (manL)

ORF00407 hydrolase, haloacid dehalogenase-like family

ORF00410 xanthine/uracil permease family protein 

ORF00411 conserved hypothetical protein TIGR00150, putative

ORF00412 acetyltransferase, GNAT family

ORF00413 expressed protein of unknown function

ORF00415 HIT family protein (hit) 

ORF00419 ABC transporter, ATP-binding protein 
ORF00421 ABC transporter, permease protein

ORF00422 conserved hypothetical protein
ORF00423 conserved hypothetical protein TIGR00091 

ORF00424 conserved hypothetical protein, POINT MUTATION

ORF00425 N utilization substance protein A (nusA)

ORF00426 conserved hypothetical protein

ORF00427 ribosomal protein L7A family

ORF00428 translation initiation factor IF-2 

ORF00429 ribosome-binding factor A (rbfA)

ORF00432 copper-transporter ATPase CopA

ORF00435 hydrolase, haloacid dehalogenase-like family 

ORF00436 DNA polymerase I (polA)

ORF00437 CoA binding domain protein 

ORF00440 DNA-binding response regulator

ORF00441 sensor histidine kinase

ORF00443 queuine tRNA-ribosyltransferase (tgt) 

ORF00444 conserved hypothetical protein 
ORF00449 glucose-6-phosphate isomerase (pgi) 

ORF00451 rhomboid family protein
ORF00452 expressed putative lipoprotein 

ORF00453 UTP-glucose-1-phosphate uridylyltransferase (galU)

ORF00454 glycerol-3-phosphate dehydrogenase (NAD(P)+) (gpsA)

ORF00455 ribonuclease P protein component (rnpA)

ORF00456 SpoIIIJ family protein

ORF00458 R3H domain protein 

ORF00463 conserved hypothetical protein 

ORF00464 RecX protein

ORF00465 RNA methyltransferase, TrmA family

ORF00470 ribonucleoside-diphosphate reductase 2, beta subunit (nrdF) 

ORF00472 ribonucleoside-diphosphate reductase 2, alpha subunit (nrdE) 
ORF00482 alcohol dehydrogenase, zinc-containing 

ORF00483 oxidoreductase, aldo/keto reductase family
ORF00484 cation efflux system protein
ORF00485 transcriptional regulator, TetR family 

ORF00496 conserved hypothetical protein 

ORF00500 acetyltransferase, GNAT family 

ORF00501 conserved hypothetical protein

ORF00502 valyl-tRNA synthetase (valS)

ORF00508 aspartate-ammonia ligase (asnA)

ORF00511 type II DNA modification methyltransferase, putative

ORF00513 phosphopantetheine adenylyltransferase (coaD) 
ORF00515 conserved hypothetical protein 
ORF00519 conserved hypothetical protein

ORF00520 conserved hypothetical protein TIGR00048

ORF00522 ABC transporter, ATP-binding/permease protein

ORF00523 ABC transporter, ATP-binding/permease protein 

ORF00524 anthranilate synthase component II (trpG) 

ORF00532 endonuclease III (nth) 

ORF00534 conserved hypothetical protein 

ORF00535 glucokinase (glk) 

ORF00536 expressed protein with rhodanese domain 

ORF00537 elongation factor Tu family protein 

ORF00540 UDP-N-acetylmuramoylalanine-D-glutamate ligase (murD)

ORF00541 UDP-N-acetylglucosamine-N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol N-acetylglucosamine transferase (murG)






WO 2004/018646 PCT/US2003/026827 

ORF00542 cell division protein DivIB, putative

ORF00544 cell division protein FtsA (ftsA)

ORF00545 cell division protein FtsZ (ftsZ)

ORF00546 ylmE protein, putative

ORF00547 ylmF protein (ylmF) 

ORF00549 ylmH protein (ylmH) 

ORF00550 cell division protein DivIVA, putative

ORF00552 isoleucyl-tRNA synthetase (ileS) 

ORF00553 conserved hypothetical protein 

ORF00554 MutT/nudix family protein 

ORF00555 ATP-dependent Clp protease, ATP-binding subunit

ORF00557 conserved hypothetical protein 

ORF00558 amino acid ABC transporter, permease protein

ORF00559 amino acid ABC transporter, ATP-binding protein

ORF00560 phosphoglucomutase/phosphomannomutase family protein 

ORF00562 methylenetetrahydrofolate dehydrogenase/methenyltetrahydrofolate cyclohydrolase (folD)

ORF00564 exodeoxyribonuclease VII, large subunit (xseA) 

ORF00566 geranyltranstransferase, putative

ORF00567 hemolysin A

ORF00570 DNA repair protein RecN (recN) 

ORF00571 expressed DegV family protein 

ORF00574 DNA-binding protein HU (hup) 

ORF00576 dihydroorotate dehydrogenase A (pyrDA) 

ORF00577 beta-lactam resistance factor (fibB)

ORF00578 beta-lactam resistance factor (fibA)

ORF00579 murM protein, putative

ORF00580 hydrolase, haloacid dehalogenase-like family 

ORF00581 HP domain protein 

ORF00582 conserved hypothetical protein 

ORF00583 cation-transporting ATPase, E1-E2 family 

ORF00588 cell division ABC transporter, ATP-binding protein FtsE (ftsE) 

ORF00589 cell division ABC transporter, permease protein FtsX (ftsX) 

ORF00591 metallo-beta-lactamase superfamily protein

ORF00593 DNA polymerase III, epsilon subunit/ATP-dependent helicase DinG 

ORF00595 aspartate aminotransferase (aspC) 

ORF00596 asparaginyl-tRNA synthetase (asnS) 

ORF00601 conserved hypothetical protein 

ORF00602 conserved hypothetical protein 

ORF00603 conserved hypothetical protein

ORF00605 zinc ABC transporter, zinc-binding adhesion lipoprotein

ORF00606 ribosomal protein L31 (rpmE)

ORF00607 DHH family protein 

ORF00609 flavodoxin

ORF00614 ribosomal protein L19 (rplS)

ORF00640 prophage LambdaSa1, single-strand binding protein (ssb)

ORF00693 DNA-binding response regulator VncR (vncR)

ORF00694 sensor histidine kinase VncS (vncS) 

ORF00699 rod shape-determining protein RodA, putative (rodA)

ORF00700 hydrolase, haloacid dehalogenase-like family

ORF00701 DNA gyrase, B subunit (gyrB)

ORF00702 septation ring formation regulator EzrA, putative

ORF00705 conserved hypothetical protein 

ORF00706 enolase (eno) 

ORF00708 3-phosphoshikimate 1-carboxyvinyltransferase (aroA)

ORF00709 shikimate kinase (aroK)

ORF00710 psr protein

ORF00711 RNA methyltransferase, TrmA family

ORF00729 sortase family protein 

ORF00731 sortase family protein

ORF00734 sortase family protein, FRAMESHIFT

ORF00743 ABC transporter, ATP-binding protein 

ORF00744 membrane protein

ORF00745 conserved hypothetical protein 

ORF00748 cylG protein (cylG)

ORF00776 DNA-entry nuclease, putative

ORF00789 2-keto-3-deoxygluconate kinase

ORF00792 2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase (eda) 

ORF00798 proline dipeptidase (pepQ) 

ORF00799 transcriptional regulator, RegM family

ORF00802 glycosyl transferase, group 1 family protein 

ORF00803 threonyl-tRNA synthetase (thrS) 

ORF00804 DNA-binding response regulator

ORF00808 amino acid ABC transporter, permease protein 

ORF00810 amino acid ABC transporter, ATP-binding protein

ORF00811 DNA-binding response regulator

ORF00812 sensory box histidine kinase 

ORF00813 metallo-beta-lactamase family protein

ORF00815 ribonuclease III (rnc)

ORF00816 expressed putative chromosome segregation SMC protein

ORF00817 hydrolase, haloacid dehalogenase-like family 

ORF00818 hydrolase, haloacid dehalogenase-like family 

ORF00819 signal recognition particle-docking protein FtsY (ftsY) 

ORF00820 ABC transporter, substrate-binding protein

ORF00821 ABC transporter, permease protein, putative 

ORF00824 transcriptional accessory protein Tex, putative 

ORF00825 conserved hypothetical protein

ORF00828 HPr(Ser) kinase/phosphatase (hprK) 

ORF00830 prolipoprotein diacylglyceryl transferase (lgt)

ORF00832 conserved hypothetical protein

ORF00835 peptidase, U32 family, putative

ORF00836 peptidase, U32 family 

ORF00837 conserved hypothetical protein 

ORF00844 lysyl-tRNA synthetase (lysS) 

ORF00846 phosphoglycerate mutase family protein

ORF00847 ebsC family protein, putative

ORF00850 peptidase, U32 family

ORF00855 oligoendopeptidase F, putative 

ORF00856 phosphoenolpyruvate carboxylase (ppc)

ORF00859 cell division protein, FtsW/RodA/SpoVE family (ftsW)

ORF00861 translation elongation factor Tu (tuf)

ORF00863 triosephosphate isomerase (tpiA)

ORF00865 phosphoglycerate mutase (gpmA) 

ORF00867 recombination protein RecR (recR) 

ORF00868 D-alanine-D-alanine ligase

ORF00869 UDP-N-acetylmuramoylalanyl-D-glutamyl-2,6-diaminopimelate-D-alanyl-D-alanyl ligase (murF)

ORF00870 oxalate:formate antiporter 

ORF00871 membrane protein, putative

ORF00873 peptide chain release factor 3 (prfC) 

ORF00876 ABC transporter, ATP-binding protein

ORF00880 ATP-dependent RNA helicase, DEAD/DEAH box family

ORF00882 conserved hypothetical protein

ORF00883 conserved hypothetical protein

ORF00884 acyltransferase family protein 

ORF00885 competence protein CelA (celA)

ORF00887 DNA internalization-related competence protein ComEC/Rec2 

ORF00889 sugar-binding transcriptional regulator, LacI family

ORF00892 DNA polymerase III, delta subunit, putative

ORF00893 superoxide dismutase, Fe-Mn (sodA) 

ORF00894 transcriptional antiterminator LicT 

ORF00895 PTS system, beta-glucosides-specific IIABC components

ORF00896 6-phospho-beta-glucosidase (bglA)

ORF00899 glycerate kinase 2 (garK) 

ORF00904 S-adenosylmethionine:tRNA ribosyltransferase-isomerase (queA) 

ORF00906 glucosamine-6-phosphate isomerase (nagB)

ORF00908 ribosomal small subunit pseudouridine synthase 

ORF00911 competence protein CoiA (coiA)

ORF00912 oligoendopeptidase B (pepB)

ORF00914 O-methyltransferase family protein

ORF00916 protease maturation protein, putative

ORF00919 alanyl-tRNA synthetase (alaS) 

ORF00925 transcriptional regulator, Cro/CI family 

ORF00928 ribonucleoside-diphosphate reductase 2, beta subunit (nrdF) 

ORF00929 ribonucleoside-diphosphate reductase 2, alpha subunit (nrdE)

ORF00930 ribonucleoside-diphosphate reductase 2, NrdH-redoxin (nrdH)

ORF00931 phosphocarrier protein HPr (ptsH)

ORF00932 phosphoenolpyruvate-protein phosphotransferase (ptsI)

ORF00933 glyceraldehyde-3-phosphate dehydrogenase, NADP-dependent (gapN) 

ORF00934 polysaccharide deacetylase family protein 

ORF00935 ATP-dependent RNA helicase, DEAD/DEAH box family 

ORF00936 uridine kinase (udk)

ORF00937 conserved hypothetical protein 

ORF00938 DNA polymerase III, gamma and tau subunits (dnaX)

ORF00940 biotin--acetyl-CoA-carboxylase ligase

ORF00941 S-adenosylmethionine synthetase (metK)

ORF00955 UDP-N-acetylglucosamine 1-carboxyvinyltransferase (murA)

ORF00956 acetyltransferase, GNAT family

ORF00957 CBS domain protein 

ORF00958 methionine aminopeptidase, type I (map)

ORF00959 ribonuclease BN, putative

ORF00962 conserved hypothetical protein

ORF00963 DNA ligase, NAD-dependent (ligA) 

ORF00964 BmrU protein, putative 

ORF00966 pullulanase, putative

ORF00973 ATP synthase F0, A subunit (atpB)

ORF00974 ATP synthase F0, B subunit (atpF)

ORF00975 ATP synthase F1, delta subunit (atpH)

ORF00976 ATP synthase F1, alpha subunit (atpA)

ORF00977 ATP synthase F1, gamma subunit (atpG)

ORF00978 ATP synthase F1, beta subunit (atpD)

ORF00979 ATP synthase F1, epsilon subunit (atpC)

ORF00981 UDP-N-acetylglucosamine 1-carboxyvinyltransferase (murA)

ORF00983 DNA-entry nuclease (endA)

ORF00984 phenylalanyl-tRNA synthetase, alpha subunit (pheS)

ORF00986 phenylalanyl-tRNA synthetase, beta subunit (pheT)

ORF00988 exonuclease RexB (rexB)

ORF00989 exonuclease RexA (rexA)

ORF00991 tRNA modification GTPase TrmE (trmE)

ORF00992 ABC transporter, ATP-binding protein

ORF00993 acetoin dehydrogenase, thymine PPi dependent, E1 component, alpha subunit

ORF00994 acetoin dehydrogenase, thymine PPi dependent, E1 component, beta subunit

ORF00995 acetoin dehydrogenase, thymine PPi dependent, E2 component, dihydrolipoamide

ORF00996 acetoin dehydrogenase, thymine PPi dependent, E3 component, dihydrolipoamide dehydrogenase

ORF00997 lipoate-protein ligase A (lplA)

ORF00998 cobyric acid synthase, putative

ORF00999 mur ligase family protein

ORF01000 conserved hypothetical protein TIGR00159

ORF01001 expressed protein of unknown function

ORF01002 phosphoglucomutase/phosphomannomutase family protein 

ORF01005 oxygen-independent coproporphyrinogen III oxidase, putative

ORF01006 conserved hypothetical protein 

ORF01007 hydrolase, haloacid dehalogenase-like family 

ORF01008 conserved hypothetical protein 

ORF01023 GTP-binding protein LepA (lepA) 

ORF01027 PilB-related protein

ORF01030 cation-transporting ATPase, E1-E2 family 

ORF01033 conserved hypothetical protein

ORF01040 Tn916, tetracycline resistance protein (tetM)

ORF01057 transcriptional regulator, GntR family

ORF01058 DNA polymerase III, alpha subunit (dnaE)

ORF01059 6-phosphofructokinase (pfk) 

ORF01060 pyruvate kinase (pyk) 

ORF01063 glucosamine--fructose-6-phosphate aminotransferase (isomerizing) (glmS)

ORF01066 phnA protein (phnA)

ORF01068 amino acid ABC transporter, permease protein 

ORF01069 amino acid ABC transporter, ATP-binding protein 

ORF01070 amino acid ABC transporter, amino acid-binding protein

ORF01072 ribosomal protein S20 (rpsT) 

ORF01073 pantothenate kinase (coaA) 

ORF01074 conserved hypothetical protein 

ORF01075 cytidine deaminase (cdd) 

ORF01076 expressed putative lipoprotein 

ORF01077 sugar ABC transporter, ATP-binding protein 

ORF01078 sugar ABC transporter, permease protein, putative

ORF01079 sugar ABC transporter, permease protein, putative 

ORF01080 NADH oxidase (nox-2) 

ORF01081 L-lactate dehydrogenase (ldh)

ORF01082 DNA gyrase, A subunit (gyrA) 

ORF01083 sortase SrtA (srtA) 

ORF01089 GMP synthase (guaA) 

ORF01090 transcriptional regulator, GntR family 

ORF01091 gid protein (gid)

ORF01093 expressed putative lipoprotein 

ORF01097 ABC transporter, ATP-binding protein

ORF01099 DNA-binding response regulator

ORF01101 site-specific recombinase, phage integrase family 

ORF01106 signal recognition particle protein Ffh (ffh)

ORF01108 conserved hypothetical protein

ORF01109 sensor histidine kinase CiaH

ORF01110 DNA-binding response regulator CiaR (ciaR)

ORF01111 aminopeptidase N (pepN) 
ORF01112 phosphate transport system regulatory protein PhoU (phoU)

ORF01113 phosphate ABC transporter, ATP-binding protein PstB, putative

ORF01114 phosphate ABC transporter, ATP-binding protein PstB, putative

ORF01115 phosphate ABC transporter, permease protein PstA, putative

ORF01116 phosphate ABC transporter, permease protein

ORF01117 phosphate ABC transporter, phosphate-binding protein

ORF01118 NOL1/NOP2/sun family protein

ORF01119 inositol monophosphatase family protein

ORF01120 conserved hypothetical protein

ORF01121 conserved hypothetical protein

ORF01122 macrolide-efflux protein mreA/riboflavin biosynthesis protein RibF 

ORF01123 tRNA pseudouridine synthase B (truB)

ORF01125 conserved hypothetical protein 

ORF01128 permease, putative 

ORF01129 ABC transporter, ATP-binding protein

ORF01131 DNA topoisomerase I (topA) 

ORF01132 DprA/SMF protein, putative DNA processing factor (dprA)

ORF01134 iron compound ABC transporter, ATP-binding protein

ORF01137 acetyltransferase, CysE/LacA/LpxA/NodL family

ORF01138 ribonuclease HII (rnhB)

ORF01139 GTP-binding protein 

ORF01176 carbamoyl-phosphate synthase, large subunit (carB)

ORF01177 carbamoyl-phosphate synthase, small subunit (carA) 

ORF01178 aspartate carbamoyltransferase (pyrB) 

ORF01179 dihydroorotase, multifunctional complex type (pyrC) 

ORF01180 orotate phosphoribosyltransferase (pyrE)

ORF01181 orotidine 5'-phosphate decarboxylase (pyrF)

ORF01183 ABC transporter, ATP-binding protein 

ORF01184 ribonucleotide reductase, truncation 

ORF01188 cardiolipin synthetase (cls)

ORF01189 formate-tetrahydrofolate ligase (fhs)

ORF01190 lipoate-protein ligase A (lplA)

ORF01198 flavoprotein-related protein 

ORF01199 flavoprotein family protein

ORF01200 membrane protein, putative

ORF01201 phosphoglucomutase (pgm) 

ORF01203 IS861, transposase OrfB

ORF01205 ABC transporter, ATP-binding/permease protein 

ORF01206 ABC transporter, ATP-binding/permease protein

ORF01207 conserved hypothetical protein

ORF01208 conserved hypothetical protein

ORF01209 Serine hydroxymethyltransferase

ORF01210 Sua5/YciO/YrdC/YwlC family protein 

ORF01211 modification methylase, HemK family 

ORF01212 peptide chain release factor 1 (prfA) 

ORF01213 thymidine kinases (tdk) 

ORF01214 4-oxalocrotonate tautomerase (xylM) 

ORF01216 ApbE family protein

ORF01220 xanthine permease (pbuX) 

ORF01221 xanthine phosphoribosyltransferase (xpt) 

ORF01222 guanosine monophosphate reductase (guaC) 

ORF01227 phosphate acetyltransferase 

ORF01228 ribosomal large subunit pseudouridine synthase, RluD subfamily

ORF01229 expressed protein of unknown function

ORF01230 GTP pyrophosphokinase family protein

ORF01231 conserved hypothetical protein

ORF01232 ribose-phosphate pyrophosphokinase (prsA)

ORF01233 cysteine desulphurase (iscS)

ORF01234 conserved hypothetical protein 

ORF01235 conserved hypothetical protein 

ORF01236 DNA repair protein RadC (radC) 

ORF01238 6-phospho-beta-glucosidase (ascB)

ORF01239 platelet activating factor, putative 

ORF01240 hydrolase, haloacid dehalogenase-like family 

ORF01242 voltage-gated chloride channel family protein

ORF01243 spermidine/putrescine ABC transporter, spermidine/putrescine-binding protein (potD)

ORF01244 spermidine/putrescine ABC transporter, permease protein (potC) 

ORF01245 spermidine/putrescine ABC transporter, permease protein (potB) 

ORF01246 spermidine/putrescine ABC transporter, ATP-binding protein (potA) 
ORF01247 UDP-N-acetylenolpyruvoylglucosamine reductase (murB)

ORF01248 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase (folK) 

ORF01250 dihydropteroate synthase (folP) 

ORF01251 GTP cyclohydrolase I (folE) 

ORF01252 folylpolyglutamate synthase (folC) 

ORF01259 aldehyde dehydrogenase family protein 

ORF01260 membrane protein

ORF01274 gls24 protein, putative 

ORF01276 gls24 protein, putative

ORF01279 conserved hypothetical protein 

ORF01282 ATP-dependent DNA helicase PcrA (pcrA) 

ORF01283 conserved hypothetical protein, FRAMESHIFT

ORF01284 uracil permease (uraA) 

ORF01285 sodium:alanine symporter family protein

ORF01286 cation efflux family protein 

ORF01290 ribosomal protein S1 (rpsA) 

ORF01292 branched-chain amino acid aminotransferase (ilvE) 

ORF01294 DNA topoisomerase IV, A subunit (parC) 

ORF01295 DNA topoisomerase IV, B subunit (parE) 

ORF01296 membrane protein, putative 

ORF01297 uracil-DNA glycosylase (ung) 

ORF01317 transcriptional regulator, LysR family, putative 

ORF01319 purine nucleoside phosphorylase (deoD) 

ORF01321 purine nucleoside phosphorylase (deoD)

ORF01323 phosphopentomutase (deoB)

ORF01324 ribose 5-phosphate isomerase (rpiA) 

ORF01327 tributyrin esterase (estA)

ORF01328 metallo-beta-lactamase superfamily protein 

ORF01329 ABC transporter, ATP-binding protein

ORF01330 ABC transporter, permease protein 

ORF01331 conserved hypothetical protein

ORF01332 adherence and virulence protein A (pavA) 

ORF01335 TPR domain protein 

ORF01336 membrane protein

ORF01338 mutator MutT protein (mutX) 

ORF01339 hyaluronidase

ORF01343 iminodiacetate oxidase, putative

ORF01344 conserved hypothetical protein TIGR00486

ORF01345 conserved hypothetical protein

ORF01346 DNA replication protein DnaD, putative

ORF01347 adenine phosphoribosyltransferase (apt)

ORF01350 single-stranded-DNA-specific exonuclease RecJ (recJ)

ORF01351 oxidoreductase, short chain dehydrogenase/reductase family 

ORF01352 metallo-beta-lactamase superfamily protein 

ORF01353 conserved hypothetical protein 

ORF01354 GTP-binding protein HflX (hflX) 

ORF01355 tRNA delta(2)-isopentenylpyrophosphate transferase (miaA) 

ORF01357 exfoliative toxin A, putative

ORF01358 pullulanase, putative 

ORF01362 conserved hypothetical protein

ORF01363 peptidase, M20/M25/M40 family

ORF01364 nitroreductase family protein 

ORF01367 excinuclease ABC, C subunit (uvrC)

ORF01380 streptococcal histidine triad family protein 

ORF01381 laminin-binding surface protein (lmb)

ORF01397 Tn5252, relaxase 

ORF01403 mercuric reductase (merA)

ORF01406 IS861, transposase OrfB, truncation

ORF01407 cation-transporting ATPase, E1-E2 family 

ORF01411 conserved hypothetical protein

ORF01412 cation-transporting ATPase, E1-E2 family 

ORF01415 transcriptional repressor CopY, putative 

ORF01416 cadmium resistance transporter, putative 

ORF01451 C-5 cytosine-specific DNA methylase

ORF01453 conserved hypothetical protein 

ORF01455 ribosomal protein L7/L12 (rplL)

ORF01456 ribosomal protein L10 (rplJ) 

ORF01458 ATP-dependent Clp protease, ATP-binding subunit

ORF01467 GTP-binding protein (cgpA)

ORF01468 ATP-dependent Clp protease, ATP-binding subunit ClpX (clpX)

ORF01470 dihydrofolate reductase (folA)

ORF01471 thymidylate synthase (thyA) 

ORF01472 HMG-CoA synthase 

ORF01473 3-hydroxy-3-methylglutaryl-CoA reductase 

ORF01474 conserved hypothetical protein

ORF01475 hemolysin III, putative

ORF01476 conserved hypothetical protein TIGR00147

ORF01479 isopentenyl-diphosphate delta-isomerase 

ORF01480 phosphomevalonate kinase

ORF01481 diphosphomevalonate decarboxylase (mvaD) 

ORF01482 mevalonate kinase, putative 

ORF01484 DNA-binding response regulator

ORF01491 polypeptide deformylase, putative 

ORF01495 ABC transporter, ATP-binding/permease protein 

ORF01496 ABC transporter, ATP-binding/permease protein 

ORF01498 ABC transporter, ATP-binding protein 

ORF01499 polyA polymerase family protein

ORF01500 DegV family protein 

ORF01501 expressed protein of unknown function 

ORF01504 PTS system, fructose specific IIABC components

ORF01505 1-phosphofructokinase (fruK)

ORF01506 lactose phosphotransferase system repressor (lacR) 

ORF01507 beta-lactam resistance factor

ORF01511 pyridine nucleotide-disulphide oxidoreductase family protein

ORF01512 tRNA (guanine-N1)-methyltransferase (trmD)

ORF01513 16S rRNA processing protein RimM (rimM)

ORF01515 transcriptional regulator, RofA family

ORF01516 KH domain protein

ORF01517 ribosomal protein S16 (rpsP) 

ORF01518 permease, putative 

ORF01519 ABC transporter, ATP-binding protein 

ORF01520 conserved hypothetical protein

ORF01523 carbamoyl-phosphate synthase, small subunit (carA) 

ORF01524 pyrimidine operon regulatory protein (pyrR) 

ORF01525 ribosomal large subunit pseudouridine synthase, RluP subfamily 

ORF01526 lipoprotein signal peptidase (lspA)

ORF01527 transcriptional regulator, LysR family

ORF01528 ribosomal protein L27 (rpmA)

ORF01529 conserved hypothetical protein

ORF01530 ribosomal protein L21 (rplU)

ORF01531 conserved hypothetical protein, FRAMESHIFT 

ORF01532 thiamine biosynthesis protein ThiI (thiI)

ORF01533 cysteine desulphurase (iscS)

ORF01536 glutathione reductase (gor) 

ORFQ1537 conserved hypothetical protein 

QRF01538 chorismate synthase (aroC) 

QRF01539 3-dehydroquinate synthase (aroB) 

ORF01540 3-dehydroquinate dehydratase (aroP) 

ORF01541 conserved hypothetical protein 

ORF01543 ribosomal protein L20 (rpIT) 

QRF01544 ribosomal protein L35 (rpml) 

QRF01545 translation initiation factor IF-3 (infC) 

ORF01546 cytidylate kinase (cmk) 

ORF01548 ferredoxin, 4Fe-4S 

ORF01550 peptidase T (pepT)

ORF01551 polysaccharide biosynthesis protein, putative

ORF01552 UDP-N-acetylmuramoylalanyl-D-glutamate-2,6-diaminopimelate ligase (murE)

ORF01553 iron compound ABC transporter, ATP-binding protein (fepC)

ORF01555 iron compound ABC transporter, permease protein

ORF01556 iron compound ABC transporter, permease protein 

ORF01558 inorganic pyrophosphatase, manganese-dependent (ppa) 

ORF01559 pyruvate formate-lyase-activating enzyme (pflA)

ORF01560 CBS domain protein 

ORF01561 conserved hypothetical protein

ORF01564 PAP2 family protein 

ORF01565 membrane protein, putative 

ORF01567 expressed sortase family protein

ORF01568 sortase family protein 

ORF01571 rogB protein FRAMESHIFT (rogB) 

ORF01587 conserved hypothetical protein 

ORF01589 RNA polymerase sigma-70 factor (rpoD) 

ORF01590 DNA primase (dnaG)

ORF01591 large conductance mechanosensitive channel protein (mscL) 

ORF01592 ribosomal protein S21 (rpsU) 

ORF01594 amino acid ABC transporter, amino acid-binding protein

ORF01598 rhodanese family protein

ORF01602 glycogen phosphorylase (glgP) 

ORF01603 4-alpha-glucanotransferase (malQ)

ORF01604 maltose operon repressor MalR, putative

ORF01605 maltose/maltodextrin ABC transporter, maltose/maltodextrin-binding protein

ORF01606 maltose ABC transporter, permease protein
ORF01607 maltose ABC transporter, permease protein

ORF01614 preprotein translocase SecA subunit, putative

ORF01619 preprotein translocase SecY family protein 

ORF01634 excinuclease ABC, B subunit (uvrB) 

ORF01636 glutamine ABC transporter, glutamine-binding protein/permease protein (glnP) 

ORF01637 glutamine ABC transporter, ATP-binding protein, GlnQ putative 

ORF01640 GTP-binding protein, GTP1/Obg family (obg) 

ORF01646 amidase family protein

ORF01647 ribosomal small subunit pseudouridine synthase A (rsuA)

ORF01648 oxidoreductase, aldo/keto reductase family

ORF01651 lactoylglutathione lyase (gloA)

ORF01652 glycosyl transferase, group 2 family protein

ORF01654 SsrA-binding protein (smpB)

ORF01655 exoribonuclease, VacB/Rnb family (vacB) 

ORF01657 preprotein translocase, SecG subunit 

ORF01658 multi-drug resistance protein 

ORF01662 dephospho-CoA kinase

ORF01663 formamidopyrimidine-DNA glycosylase (mutM)

ORF01677 GTP-binding protein Era (era)

ORF01678 diacylglycerol kinase (dgkA)

ORF01679 conserved hypothetical protein TIGR00043

ORF01685 PhoH family protein

ORF01687 conserved hypothetical protein 

ORF01689 conserved hypothetical protein 

ORF01690 ribosome recycling factor (frr)

ORF01691 uridylate kinase (pyrH)

ORF01693 peptide ABC transporter, ATP-binding protein FRAMESHIFT 

ORF01697 ribosomal protein L1 (rplA) 

ORF01698 ribosomal protein L11 (rplK) 

ORF01706 IS861, transposase OrfB

ORF01707 chorismate binding enzyme 

ORF01708 FtsK/SpoIIIE family protein

ORF01709 peptidyl-prolyl cis-trans isomerase, cyclophilin-type

ORF01710 manganese ABC transporter, permease protein

ORF01711 manganese ABC transporter, ATP-binding protein

ORF01712 manganese ABC transporter, manganese-binding adhesion lipoprotein

ORF01713 iron-dependent transcriptional regulator

ORF01714 5-methylthioadenosine nucleosidase/S-adenosylhomocysteine nucleosidase (pfs) 

ORF01716 MutT/nudix family protein 

ORF01718 UDP-N-acetylglucosamine pyrophosphorylase (glmU)

ORF01722 oxidoreductase, Gfo/Idh/MocA family

ORF01725 gluconate 5-dehydrogenase, putative

ORF01726 conserved hypothetical protein

ORF01738 branched-chain amino acid transport system II carrier protein (brnQ)

ORF01739 methionyl-tRNA synthetase (metG)

ORF01745 exodeoxyribonuclease (exoA) 

ORF01746 conserved hypothetical protein 

ORF01752 copper homeostasis protein CutC, putative

ORF01755 tetrapyrrole methylase family protein

ORF01756 conserved hypothetical protein

ORF01758 DNA polymerase III, delta prime subunit, putative

ORF01759 thymidylate kinase (tmk)

ORF01773 ATP-dependent Clp protease, proteolytic subunit ClpP (clpP)

ORF01774 uracil phosphoribosyltransferase (upp)

ORF01777 RNA methyltransferase, TrmH family, group 2 
ORF01781 conserved hypothetical protein TIGR00278

ORF01782 ribosomal large subunit pseudouridine synthase B (rluB)
ORF01783 conserved hypothetical protein TIGR00281

ORF01784 conserved hypothetical protein

ORF01785 integrase/recombinase, phage integrase family

ORF01786 CBS domain protein

ORF01787 conserved hypothetical protein

ORF01788 HAM1 protein 

ORF01789 glutamate racemase (murI)

ORF01791 membrane protein, putative

ORF01792 transcriptional regulator, biotin repressor family

ORF01793 membrane protein, putative

ORF01795 RNA methyltransferase, TrmH family

ORF01796 acylphosphatase

ORF01797 lipoprotein, putative

ORF01799 amino acid ABC transporter, permease protein

ORF01801 amidase family protein

ORF01802 transcription elongation factor GreA (greA)

ORF01803 conserved hypothetical protein

ORF01804 acetyltransferase, GNAT family

ORF01805 UDP-N-acetylmuramate--alanine ligase (murC)

ORF01806 conserved hypothetical protein

ORF01808 expressed putative helicase

ORF01811 phosphoglycerate dehydrogenase-related protein 

ORF01812 primosomal protein DnaI (dnaI)

ORF01813 conserved hypothetical protein 

ORF01814 conserved hypothetical protein TIGR00244
ORF01815 sensor histidine kinase CsrS (csrS)
ORF01816 DNA-binding response regulator CsrR (csrR)

ORF01817 conserved hypothetical protein

ORF01818 heat shock protein HtpX (htpX)

ORF01820 lemA protein (lemA)

ORF01821 glucose-inhibited division protein B (gidB) 

ORF01822 sodium transport family protein 

ORF01823 potassium uptake protein, Trk family, putative 

ORF01825 ABC transporter, ATP-binding protein 

ORF01828 branched-chain amino acid transport system II carrier protein (brnQ) 

ORF01829 alcohol dehydrogenase, zinc-containing (adh) 

ORF01830 ABC transporter, permease protein 

ORF01831 ABC transporter, ATP-binding protein 

ORF01833 expressed YaeC family protein 

ORF01834 ABC transporter, substrate-binding protein 

ORF01835 glutamine amidotransferase, class I 

ORF01837 conserved hypothetical protein TIGR01033

ORF01846 glycerol uptake facilitator protein (glpF)

ORF01849 conserved hypothetical protein 

ORF01851 conserved hypothetical protein 

ORF01852 iojap-related protein 

ORF01854 conserved hypothetical protein TIGR00488

ORF01855 conserved hypothetical protein TIGR00482

ORF01856 conserved hypothetical protein TIGR00253

ORF01857 GTP-binding protein 

ORF01668 hydrolase, haloacid dehalogenase-like family

ORF01860 glutamyl-tRNA(Gln) amidotransferase, B subunit (gatB)

ORF01861 glutamyl-tRNA(Gln) amidotransferase, A subunit (gatA)
ORF01862 glutamyl-tRNA(Gln) amidotransferase, C subunit (gatC)

ORF01867 isochorismatase family protein

ORF01869 transcriptional regulator CodY, putative

ORF01870 aminotransferase, class I 

ORF01871 universal stress protein family FRAMESHIFT 

ORF01872 hydrolase, haloacid dehalogenase-like family

ORF01873 asparaginase family protein

ORF01874 shikimate 5-dehydrogenase (aroE)

ORF01876 ATP-dependent DNA helicase RecG (recG)

ORF01878 alanine racemase (alr)

ORF01879 holo-(acyl-carrier-protein) synthase (acpS) 

ORF01881 preprotein translocase, SecA subunit (secA) 

ORF01882 mannose-6-phosphate isomerase, class I (manA) 

ORF01883 fructokinase (scrK) 

ORF01885 PTS system IIABC components

ORF01886 sucrose-6-phosphate hydrolase (scrB) 

ORF01887 sucrose operon repressor ScrR (scrR) 

ORF01888 N utilization substance protein B (nusB)

ORF01889 conserved hypothetical protein 

ORF01890 translation elongation factor P (efp) 

ORF01900 cytidine/deoxycytidylate deaminase family protein

ORF01906 excinuclease ABC, A subunit (uvrA)

ORF01907 conserved hypothetical protein

ORF01908 magnesium transporter, CorA family (corA)

ORF01909 ribosomal protein S18 (rpsR)

ORF01910 single-strand binding protein (ssb) 

ORF01911 ribosomal protein S6 (rpsF) 

ORF01912 A/G-specific adenine glycosylase (mutY) 

ORF01914 thioredoxin (trx) 

ORF01915 PAP2 family protein 

ORF01916 MutS2 family protein 

ORF01917 conserved hypothetical protein 

ORF01918 conserved hypothetical protein 

ORF01919 ribonuclease HIII (rnhC)

ORF01920 signal peptidase I

ORF01921 helicase, putative

ORF01923 DNA-damage inducible protein P (dinP)

ORF01924 formate acetyltransferase (pflD)

ORF01926 conserved hypothetical protein

ORF01927 proteinase, putative, degenerate, FRAMESHIFT 

ORF01929 glycerol uptake facilitator protein, putative 

ORF01930 universal stress protein family

ORF01933 X-pro dipeptidyl-peptidase (pepX) 

ORF01937 ABC transporter, ATP-binding protein CydC (cydC) 

ORF01938 ABC transporter, ATP-binding protein CydD 

ORF01945 conserved hypothetical protein TIGR00103

ORF01948 exonuclease

ORF01949 conserved hypothetical protein

ORF01950 conserved hypothetical protein TIGR00275

ORF01952 ribosomal protein S14 (rpsN) 

ORF01957 O-sialoglycoprotein endopeptidase family protein 

ORF01958 ribosomal-protein-alanine acetyltransferase, putative 

ORF01960 expressed protein of unknown function 

ORF01961 conserved hypothetical protein 

ORF01962 metallo-beta-lactamase superfamily protein 
ORF01963 conserved hypothetical protein

ORF01964 glutamine synthetase, type I (glnA)

ORF01965 transcriptional regulator GlnR (glnR)

ORF01967 conserved hypothetical protein

ORF01969 phosphoglycerate kinase (pgk)

ORF01971 glyceraldehyde 3-phosphate dehydrogenase (gap)

ORF01972 translation elongation factor G (fusA)

ORF01973 ribosomal protein S7 (rpsG)

ORF01974 ribosomal protein S12 (rpsL)

ORF01975 pur operon repressor (purR)

ORF01976 HD domain protein

ORF01977 conserved hypothetical protein 

ORF01978 conserved hypothetical protein 

ORF01979 ribulose-phosphate 3-epimerase (rpe)

ORF01980 conserved hypothetical protein TIGR00157

ORF01983 dimethyladenosine transferase (ksgA)

ORF01985 primase-related protein

ORF01987 deoxyribonuclease, TatD family

ORF01992 dltD protein (dltD)

ORF01993 D-alanyl carrier protein (dltC)

ORF01994 dltB protein (dltB)

ORF01996 D-alanine-activating enzyme (dltA)

ORF01997 sensor histidine kinase

ORF01998 DNA-binding response regulator

ORF01999 ribosomal protein L34 (rpmH)

ORF02004 amino acid ABC transporter, ATP-binding protein

ORF02007 conserved hypothetical protein

ORF02008 transcriptional antiterminator, BglG family

ORF02017 sugar binding transcriptional regulator, LacI family

ORF02018 transaldolase family protein

ORF02019 carbohydrate isomerase, AraD/FucA family

ORF02020 hexulose-6-phosphate isomerase, putative

ORF02021 hexulose-6-phosphate synthase, putative 

ORF02022 PTS system, IIA component

ORF02023 PTS system, IIB component

ORF02024 transport protein SgaT, putative

ORF02027 adenylosuccinate synthetase (purA)

ORF02033 chaperonin, 33 kDa (hslO)

ORF02034 NifR3/Smm1 family protein 

ORF02037 ATP-dependent Clp protease, ATP-binding subunit 

ORF02038 transcriptional regulator CtsR (ctsR) 

ORF02040 translation elongation factor Ts (tsf) 

ORF02041 ribosomal protein S2 (rpsB)

ORF02043 alkyl hydroperoxide reductase, subunit F (ahpF)

ORF02076 prophage LambdaSa2, single-strand binding protein (ssb)

ORF02082 prophage LambdaSa2, type II DNA modification methyltransferase, putative

ORF02086 prophage LambdaSa2, replicative DNA helicase (dnaC)

ORF02104 endopeptidase O (pepO) 

ORF02110 polypeptide deformylase (def)

ORF02111 sugar binding transcriptional regulator RegR (regR)

ORF02112 conserved hypothetical protein

ORF02113 PTS system, IID component

ORF02114 PTS system, IIC component

ORF02115 PTS system, IIB component

ORF02116 glucuronyl hydrolase
ORF02118 PTS system, IIA component

ORF02120 oxidoreductase, short-chain dehydrogenase/reductase family

ORF02121 conserved hypothetical protein 
ORF02122 carbohydrate kinase, PfkB family 

ORF02123 2-dehydro-3-deoxyphosphogluconate aldolase/4-hydroxy-2-oxoglutarate aldolase (eda) 
ORF02127 DNA polymerase III, alpha subunit, Gram-positive type

ORF02129 prolyl-tRNA synthetase (proS)

ORF02130 membrane-associated zinc metalloprotease, putative 

ORF02131 phosphatidate cytidylyltransferase (cdsA) 

ORF02132 undecaprenyl diphosphate synthase (uppS) 

ORF02133 preprotein translocase, YajC subunit (yajC)

ORF02140 glucan 1,6-alpha-glucosidase (dexB)

ORF02141 sugar ABC transporter, ATP-binding protein (msmK) 

ORF02142 helix-turn-helix domain protein, fis-type 

ORF02144 tagatose 1,6-diphosphate aldolase (lacD)

ORF02145 tagatose-6-phosphate kinase (lacC)

ORF02146 galactose-6-phosphate isomerase, LacB subunit (lacB)

ORF02147 galactose-6-phosphate isomerase, LacA subunit (lacA)

ORF02149 PTS system, IIC component, putative

ORF02150 PTS system, IIB component, putative

ORF02152 PTS system, IIA component, putative

ORF02153 lactose phosphotransferase system repressor (lacR) 

ORF02157 adhesion lipoprotein 

ORF02158 expressed protein of unknown function TIGR00256 
ORF02159 GTP pyrophosphokinase (relA)

ORF02161 nrdI protein (nrdI)

ORF02164 iron ABC transporter, iron-binding protein
ORF02165 DNA-binding response regulator
ORF02167 PTS system, IID component

ORF02168 PTS system, IID component

ORF02174 ABC transporter, ATP-binding protein 

ORF02176 response regulator 
ORF02177 conserved hypothetical protein 
ORF02178 PTS system, IIABC components 

ORF02179 sensor histidine kinase 

ORF02180 phosphate regulon response regulator PhoB (phoB) 
ORF02182 phosphate ABC transporter, ATP-binding protein (pstB) 
ORF02183 phosphate ABC transporter, permease protein
ORF02184 phosphate ABC transporter, permease protein
ORF02188 conserved hypothetical protein TIGR00046
ORF02189 ribosomal protein L11 methyltransferase (prmA)
ORF02197 conserved hypothetical protein
ORF02199 ATPase, AAA family

ORF02249 mercuric reductase (merA) 

ORF02272 DNA topology modulation protein FlaR, putative 
ORF02273 glycerol dehydrogenase, putative 

ORF02281 DNA-binding response regulator

ORF02285 leucyl-tRNA synthetase (leuS) 

ORF02290 transcription antitermination protein NusG (nusG)

ORF02293 penicillin-binding protein 2A (pbp2A)

ORF02294 ribosomal large subunit pseudouridine synthase, RluD subfamily
ORF02296 phosphopentomutase (deoB) 
ORF02297 deoxyribose-phosphate aldolase (deoC)

ORF02300 uridine phosphorylase (udp)

ORF02302 60 kDa chaperonin (groEL)
ORF02303 chaperonin, 10 kDa (groES)

ORF02305 ABC transporter, ATP-binding protein

ORF02306 ABC transporter, permease protein

ORF02307 expressed putative lipoprotein 

ORF02309 glyoxalase family protein

ORF02310 conserved hypothetical protein

ORF02311 anaerobic ribonucleoside-triphosphate reductase activating protein (nrdG)

ORF02312 acetyltransferase, GNAT family

ORF02315 anaerobic ribonucleoside-triphosphate reductase (nrdD)

ORF02318 conserved hypothetical protein

ORF02320 conserved hypothetical protein 

ORF02321 conserved hypothetical protein 

ORF02322 recA protein (recA) 

ORF02325 DNA-3-methyladenine glycosylase I (tag) 

ORF02327 Holliday junction DNA helicase RuvA (ruvA) 

ORF02329 DNA mismatch repair protein HexB (hexB)

ORF02333 arginine repressor ArgR, putative 

ORF02334 arginyl-tRNA synthetase (argS) 

ORF02337 conserved hypothetical protein

ORF02338 conserved hypothetical protein

ORF02339 aspartyl-tRNA synthetase (aspS)

ORF02340 histidyl-tRNA synthetase (hisS)

ORF02342 ribosomal protein L33 (rpmG)

ORF02357 DNA-binding response regulator

ORF02359 membrane protein, putative

ORF02360 carbamate kinase (arcC)

ORF02361 ornithine carbamoyltransferase (argF) 

ORF02364 amino acid ABC transporter, ATP-binding protein 

ORF02365 amino acid ABC transporter, permease and amino acid-binding protein 

ORF02370 membrane protein, putative 

ORF02371 transcriptional regulator, TetR family, putative

ORF02373 ribosomal protein S4 (rpsD)

ORF02374 conserved hypothetical protein

ORF02375 replicative DNA helicase (dnaC)

ORF02376 ribosomal protein L9 (rplI)

ORF02377 DHH family protein

ORF02378 glucose inhibited division protein A (gidA)

ORF02380 tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase (trmU)

ORF02381 L-serine dehydratase, iron-sulfur-dependent, beta subunit (sdhB)

ORF02382 L-serine dehydratase, iron-sulfur-dependent, alpha subunit (sdhA)

ORF02385 cobalt transport family protein

ORF02386 ABC transporter, ATP-binding protein

ORF02387 ABC transporter, ATP-binding protein, FRAMESHIFT

ORF02388 CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase (pgsA) 

ORF02389 peptidase, M16 family 

ORF02390 conserved hypothetical protein 

ORF02391 conserved hypothetical protein

ORF02392 recF protein (recF)

ORF02396 inosine-5'-monophosphate dehydrogenase (guaB)

ORF02397 transcriptional regulator, ArgR family

ORF02400 arginine deiminase (arcA)

ORF02402 ornithine carbamoyltransferase (argF)

ORF02404 carbamate kinase (arcC)

ORF02405 tryptophanyl-tRNA synthetase (trpS)

ORF02407 conserved hypothetical protein

Table 8: GBS genes shared with GAS and pneumococcus 

ORFxxxxx Annotation

ORF02408 ABC transporter, ATP-binding protein

ORF02409 ABC transporter, permease protein, putative

ORF02410 conserved hypothetical protein TIGR00246

ORF02411 serine protease I

ORF02412 partitioning protein, ParB family

ORF02413 chromosomal replication initiator protein DnaA (dnaA)

ORF02415 DNA polymerase III, beta subunit (dnaN)

ORF02417 conserved hypothetical protein 

ORF02419 conserved hypothetical GTP-binding protein 

ORF02420 peptidyl-tRNA hydrolase (pth)

ORF02421 transcription-repair coupling factor (mfd)

ORF02423 S4 domain protein

ORF02424 cell division protein DivIC, putative

ORF02426 expressed protein of unknown function

ORF02427 MesJ/Ycf62 family protein

ORF02429 cell division protein FtsH (ftsH)
ORFxxxxx Annotation

ORF00017 phosphoribosylaminoimidazolecarboxamide formyltransferase/IMP cyclohydrolase (purH)
ORF00025 conserved hypothetical protein 
ORF00029 acetyl xylan esterase, putative 
ORF00042 aldehyde-alcohol dehydrogenase (adhE) 

ORF00044 threonine synthase (thrC)
ORF00081 ribosomal protein L17 (rplQ)
ORF00090 conserved hypothetical protein
ORF00129 argininosuccinate synthase (argG) 

ORF00156 oligopeptide ABC transporter, substrate-binding protein, putative 

ORF00189 protease, putative

ORF00194 thioredoxin family protein

ORF00195 tRNA binding domain protein

ORF00217 conserved domain protein

ORF00218 PTS system, IIB component, putative

ORF00220 transketolase, N-terminal subunit

ORF00221 transketolase, C-terminal subunit

ORF00223 oxidoreductase, putative 

ORF00282 acetyltransferase, GNAT family 

ORF00290 IS1381, transposase OrfB

ORF00291 IS1381, transposase OrfA 

ORF00293 conserved hypothetical protein 

ORF00301 membrane protein, putative
ORF00343 ABC transporter, permease protein, putative
ORF00344 conserved hypothetical protein
ORF00382 aspartate kinase family protein 
ORF00399 conserved hypothetical protein 
ORF00439 cell wall surface anchor family protein 
ORF00447 cytidine/deoxycytidylate deaminase family protein 
ORF00450 5-formyltetrahydrofolate cyclo-ligase family protein 

ORF00480 transcriptional regulator, MerR family 

ORF00499 acetyltransferase, GNAT family 

ORF00504 magnesium transporter, CorA family
ORF00521 VanZF domain protein
ORF00612 IS1381, transposase OrfA

ORF00613 IS1381, transposase OrfB

ORF00690 transmembrane protein Vexp1 (vex1)
ORF00691 ABC transporter, ATP-binding protein Vexp2 (vex2)

ORF00692 transmembrane protein Vexp3 (vex3)
ORF00714 conserved hypothetical protein 

ORF00732 expressed cell wall surface anchor family protein, putative 

ORF00774 ABC transporter, ATP-binding protein 
ORF00778 ABC transporter, ATP-binding protein 

ORF00780 conserved hypothetical protein

ORF00790 beta-glucuronidase

ORF00800 alpha amylase family protein

ORF00807 amino acid ABC transporter, permease protein

ORF00809 amino acid ABC transporter, amino acid-binding protein
ORF00814 conserved hypothetical protein
ORF00823 bacterial luciferase family protein
ORF00840 riboflavin biosynthesis protein RibD (ribD)

ORF00841 riboflavin synthase, alpha subunit (ribE)
ORF00842 riboflavin biosynthesis protein RibA (ribA)

ORF00843 riboflavin synthase, beta subunit (ribH)

ORF00866 penicillin-binding protein 2b

ORF00905 membrane protein, putative

ORF00910 major facilitator family protein

ORF00913 hydrolase, haloacid dehalogenase-like family

ORF00918 conserved hypothetical protein

ORF00945 conserved hypothetical protein
ORF00948 ABC transporter, ATP-binding protein

ORF00952 phosphomethylpyrimidine kinase (thiD)

ORF00953 hydroxyethylthiazole kinase (thiM)

ORF00954 thiamine-phosphate pyrophosphorylase (thiE) 

ORF00961 GtrA family protein 

ORF00967 1,4-alpha-glucan branching enzyme (glgB)

ORF00968 glucose-1-phosphate adenylyltransferase (glgC)

ORF00971 glycogen synthase (glgA) 

ORF00985 acetyltransferase, GNAT family 

ORF00990 magnesium transporter, CorA family, putative 
ORF01022 nucleoside diphosphate kinase (ndk) 
ORF01031 nucleoside diphosphate kinase domain protein 

ORF01085 conserved hypothetical protein 

ORF01087 IS1381, transposase OrfA

ORF01088 IS1381, transposase OrfB

ORF01098 ABC transporter, permease protein, putative

ORF01100 sensor histidine kinase

ORF01102 ABC transporter, substrate-binding protein

ORF01127 protease, putative

ORF01135 iron compound ABC transporter, permease protein

ORF01136 iron compound ABC transporter, permease protein

ORF01185 aspartate-semialdehyde dehydrogenase (asd) 

ORF01217 conserved hypothetical protein 

ORF01218 conserved hypothetical protein

ORF01219 formate/nitrite transporter family protein 

ORF01226 oxidoreductase, short chain dehydrogenase/reductase family, FRAMESHIFT 

ORF01254 homoserine kinase (thrB) 

ORF01255 homoserine dehydrogenase (hom)

ORF01264 transcriptional regulator, Cro/CI family

ORF01268 thiol peroxidase (psaD) 

ORF01305 glycosyltransferase CpsJ(V) (cpsJ) 

ORF01306 glycosyltransferase CpsO(V) (cpsO)

ORF01313 CpsD protein (cpsD) 

ORF01314 cpsC protein (cpsC) 

ORF01315 capsular polysaccharide biosynthesis protein CpsB (cpsB) 

ORF01316 capsular polysaccharide biosynthesis protein CpsA (cpsA) 

ORF01326 conserved hypothetical protein 

ORF01333 alpha-acetolactate decarboxylase (budA) 

ORF01334 acetolactate synthase, catabolic (ilvK) 

ORF01337 MutT/nudix family protein 

ORF01369 MATE efflux family protein 

ORF01398 Tn5252, Orf 9 protein 

ORF01399 Tn5252, Orf 10 protein 

ORF01446 protease, putative 

ORF01447 conserved hypothetical protein 

ORF01449 conserved hypothetical protein 

ORF01492 NADP-specific glutamate dehydrogenase (gdhA)
ORF01569 expressed cell wall surface anchor family protein

ORF01570 cell wall surface anchor family protein 

ORF01574 polysaccharide biosynthesis protein 

ORF01579 nucleotidyl transferase, putative